
COMPOUNDS AND METHODS FOR DIAGNOSIS OF TUBERCULOSIS 

TECHNICAL FIELD 

The present invention relates generally to the detection of Mycobacterium 
tuberculosis infection. The invention is more particularly related to polypeptides comprising
a Mycobacterium tuberculosis antigen, or a portion or other variant thereof, and the use of 
such polypeptides for the serodiagnosis of Mycobacterium tuberculosis infection. 

BACKGROUND OF THE INVENTION 

Tuberculosis is a chronic, infectious disease that is generally caused by
infection with Mycobacterium tuberculosis. It is a major disease in developing countries, as
well as an increasing problem in developed areas of the world, with about 8 million new
cases and 3 million deaths each year. Although the infection may be asymptomatic for a
considerable period of time, the disease is most commonly manifested as an acute
inflammation of the lungs, resulting in fever and a nonproductive cough. If the disease is left
untreated, serious complications and death typically result.

Although tuberculosis can generally be controlled using extended antibiotic 
therapy, such treatment is not sufficient to prevent the spread of the disease. Infected 
individuals may be asymptomatic, but contagious, for some time. In addition, although
compliance with the treatment regimen is critical, patient behavior is difficult to monitor.
Some patients do not complete the course of treatment, which can lead to ineffective 
treatment and the development of drug resistance. 

Inhibiting the spread of tuberculosis will require effective vaccination and
accurate, early diagnosis of the disease. Currently, vaccination with live bacteria is the most
efficient method for inducing protective immunity. The most common Mycobacterium for
this purpose is Bacillus Calmette-Guerin (BCG), an avirulent strain of Mycobacterium bovis.
However, the safety and efficacy of BCG are a source of controversy and some countries, such
as the United States, do not vaccinate the general public. Diagnosis is commonly achieved
using a skin test, which involves intradermal exposure to tuberculin PPD (purified protein
derivative). Antigen-specific T cell responses result in measurable induration at the injection
site by 48-72 hours after injection, which indicates exposure to mycobacterial antigens.
Sensitivity and specificity have, however, been a problem with this test, and individuals
vaccinated with BCG cannot be distinguished from infected individuals.

While macrophages have been shown to act as the principal effectors of
M. tuberculosis immunity, T cells are the predominant inducers of such immunity. The
essential role of T cells in protection against M. tuberculosis infection is illustrated by the
frequent occurrence of M. tuberculosis in AIDS patients, due to the depletion of CD4 T cells
associated with human immunodeficiency virus (HIV) infection. Mycobacterium-reactive
CD4 T cells have been shown to be potent producers of gamma-interferon (IFN-γ), which, in
turn, has been shown to trigger the anti-mycobacterial effects of macrophages in mice. While
the role of IFN-γ in humans is less clear, studies have shown that 1,25-dihydroxy-vitamin D3,
either alone or in combination with IFN-γ or tumor necrosis factor-alpha, activates human
macrophages to inhibit M. tuberculosis infection. Furthermore, it is known that IFN-γ
stimulates human macrophages to make 1,25-dihydroxy-vitamin D3. Similarly, IL-12 has
been shown to play a role in stimulating resistance to M. tuberculosis infection. For a review
of the immunology of M. tuberculosis infection see Chan and Kaufmann, in Tuberculosis:
Pathogenesis, Protection and Control, Bloom (ed.), ASM Press, Washington, DC, 1994.

Accordingly, there is a need in the art for improved diagnostic methods for 
detecting tuberculosis. The present invention fulfills this need and further provides other
related advantages.

SUMMARY OF THE INVENTION 

Briefly stated, the present invention provides compositions and methods for 
diagnosing tuberculosis. In one aspect, polypeptides are provided comprising an antigenic 
portion of a soluble M. tuberculosis antigen, or a variant of such an antigen that differs only
in conservative substitutions and/or modifications. In one embodiment of this aspect, the 
soluble antigen has one of the following N-terminal sequences: 

(a) Asp-Pro-Val-Asp-Ala-Val-Ile-Asn-Thr-Thr-Cys-Asn-Tyr-Gly-Gln-
Val-Val-Ala-Ala-Leu (SEQ ID NO: 115);

(b) Ala-Val-Glu-Ser-Gly-Met-Leu-Ala-Leu-Gly-Thr-Pro-Ala-Pro-Ser
(SEQ ID NO: 116);

(c) Ala-Ala-Met-Lys-Pro-Arg-Thr-Gly-Asp-Gly-Pro-Leu-Glu-Ala-Ala-
Lys-Glu-Gly-Arg (SEQ ID NO: 117);

(d) Tyr-Tyr-Trp-Cys-Pro-Gly-Gln-Pro-Phe-Asp-Pro-Ala-Trp-Gly-Pro
(SEQ ID NO: 118);

(e) Asp-Ile-Gly-Ser-Glu-Ser-Thr-Glu-Asp-Gln-Gln-Xaa-Ala-Val (SEQ ID
NO: 119);

(f) Ala-Glu-Glu-Ser-Ile-Ser-Thr-Xaa-Glu-Xaa-Ile-Val-Pro (SEQ ID
NO: 120);

(g) Asp-Pro-Glu-Pro-Ala-Pro-Pro-Val-Pro-Thr-Thr-Ala-Ala-Ser-Pro-Pro-
Ser (SEQ ID NO: 121);

(h) Ala-Pro-Lys-Thr-Tyr-Xaa-Glu-Glu-Leu-Lys-Gly-Thr-Asp-Thr-Gly
(SEQ ID NO: 122);

(i) Asp-Pro-Ala-Ser-Ala-Pro-Asp-Val-Pro-Thr-Ala-Ala-Gln-Leu-Thr-Ser-
Leu-Leu-Asn-Ser-Leu-Ala-Asp-Pro-Asn-Val-Ser-Phe-Ala-Asn (SEQ
ID NO: 123);

(j) Xaa-Asp-Ser-Glu-Lys-Ser-Ala-Thr-Ile-Lys-Val-Thr-Asp-Ala-Ser
(SEQ ID NO: 129);

(k) Ala-Gly-Asp-Thr-Xaa-Ile-Tyr-Ile-Val-Gly-Asn-Leu-Thr-Ala-Asp
(SEQ ID NO: 130); or

(l) Ala-Pro-Glu-Ser-Gly-Ala-Gly-Leu-Gly-Gly-Thr-Val-Gln-Ala-Gly
(SEQ ID NO: 131)

wherein Xaa may be any amino acid. 

In a related aspect, polypeptides are provided comprising an immunogenic 
portion of an M. tuberculosis antigen, or a variant of such an antigen that differs only in 
conservative substitutions and/or modifications, the antigen having one of the following N- 
terminal sequences: 
(m) Xaa-Tyr-Ile-Ala-Tyr-Xaa-Thr-Thr-Ala-Gly-Ile-Val-Pro-Gly-Lys-Ile-
Asn-Val-His-Leu-Val (SEQ ID NO: 132); or

(n) Asp-Pro-Pro-Asp-Pro-His-Gln-Xaa-Asp-Met-Thr-Lys-Gly-Tyr-Tyr-
Pro-Gly-Gly-Arg-Arg-Xaa-Phe (SEQ ID NO: 124)
wherein Xaa may be any amino acid. 

In another embodiment, the soluble M. tuberculosis antigen comprises an 
amino acid sequence encoded by a DNA sequence selected from the group consisting of the 
sequences recited in SEQ ID NOS: 1, 2, 4-10, 13-25, 52, 94 and 96, the complements of said 
sequences, and DNA sequences that hybridize to a sequence recited in SEQ ID NOS: 1, 2, 
4-10, 13-25, 52, 94 and 96 or a complement thereof under moderately stringent conditions. 

In a related aspect, the polypeptides comprise an antigenic portion of an
M. tuberculosis antigen, or a variant of such an antigen that differs only in conservative
substitutions and/or modifications, wherein the antigen comprises an amino acid sequence 
encoded by a DNA sequence selected from the group consisting of the sequences recited in 
SEQ ID NOS: 26-51, 133, 134, 158-178 and 196, the complements of said sequences, and 
DNA sequences that hybridize to a sequence recited in SEQ ID NOS: 26-51, 133, 134, 158- 
178 and 196 or a complement thereof under moderately stringent conditions. 

In related aspects, DNA sequences encoding the above polypeptides, 
recombinant expression vectors comprising these DNA sequences and host cells transformed 
or transfected with such expression vectors are also provided. 

In another aspect, the present invention provides fusion proteins comprising a 
first and a second inventive polypeptide or, alternatively, an inventive polypeptide and a 
known M. tuberculosis antigen. 

In further aspects of the subject invention, methods and diagnostic kits are 
provided for detecting tuberculosis in a patient. The methods comprise: (a) contacting a 
biological sample with at least one of the above polypeptides; and (b) detecting in the sample 
the presence of antibodies that bind to the polypeptide or polypeptides, thereby detecting 
M. tuberculosis infection in the biological sample. Suitable biological samples include whole 
blood, sputum, serum, plasma, saliva, cerebrospinal fluid and urine. The diagnostic kits 
comprise one or more of the above polypeptides in combination with a detection reagent. 
The present invention also provides methods for detecting M. tuberculosis
infection comprising: (a) obtaining a biological sample from a patient; (b) contacting the
sample with at least one oligonucleotide primer in a polymerase chain reaction, the
oligonucleotide primer being specific for a DNA sequence encoding the above polypeptides;
and (c) detecting in the sample a DNA sequence that amplifies in the presence of the first and
second oligonucleotide primers. In one embodiment, the oligonucleotide primer comprises at
least about 10 contiguous nucleotides of such a DNA sequence.

In a further aspect, the present invention provides a method for detecting 
M. tuberculosis infection in a patient comprising: (a) obtaining a biological sample from the
patient; (b) contacting the sample with an oligonucleotide probe specific for a DNA sequence
encoding the above polypeptides; and (c) detecting in the sample a DNA sequence that 
hybridizes to the oligonucleotide probe. In one embodiment, the oligonucleotide probe 
comprises at least about 15 contiguous nucleotides of such a DNA sequence. 

In yet another aspect, the present invention provides antibodies, both 
polyclonal and monoclonal, that bind to the polypeptides described above, as well as methods
for their use in the detection of M. tuberculosis infection. 

These and other aspects of the present invention will become apparent upon 
reference to the following detailed description and attached drawings. All references 
disclosed herein are hereby incorporated by reference in their entirety as if each was 
incorporated individually.

BRIEF DESCRIPTION OF THE DRAWINGS AND SEQUENCE IDENTIFIERS 

Figures 1A and 1B illustrate the stimulation of proliferation and interferon-γ
production in T cells derived from a first and a second M. tuberculosis-immune donor,
respectively, by the 14 Kd, 20 Kd and 26 Kd antigens described in Example 1.

Figures 2A-D illustrate the reactivity of antisera raised against secretory
M. tuberculosis proteins, the known M. tuberculosis antigen 85b and the inventive antigens
Tb38-1 and TbH-9, respectively, with M. tuberculosis lysate (lane 2), M. tuberculosis
secretory proteins (lane 3), recombinant Tb38-1 (lane 4), recombinant TbH-9 (lane 5) and
recombinant 85b (lane 5).

Figure 3A illustrates the stimulation of proliferation in a TbH-9-specific T cell
clone by secretory M. tuberculosis proteins, recombinant TbH-9 and a control antigen,
TbRa11.

Figure 3B illustrates the stimulation of interferon-γ production in a TbH-9-specific
T cell clone by secretory M. tuberculosis proteins, PPD and recombinant TbH-9.

Figure 4 illustrates the reactivity of two representative polypeptides with sera 
from M. tuberculosis-infected and uninfected individuals, as compared to the reactivity of 
bacterial lysate.

Figure 5 shows the reactivity of four representative polypeptides with sera 
from M. tuberculosis-infected and uninfected individuals, as compared to the reactivity of the
38 kD antigen. 

Figure 6 shows the reactivity of recombinant 38 kD and TbRa11 antigens with
sera from M. tuberculosis patients, PPD positive donors and normal donors. 

Figure 7 shows the reactivity of the antigen TbRa2A with 38 kD negative sera.

Figure 8 shows the reactivity of the antigen of SEQ ID NO: 60 with sera from
M. tuberculosis patients and normal donors.

Figure 9 illustrates the reactivity of the recombinant antigen TbH-29 (SEQ ID
NO: 137) with sera from M. tuberculosis patients, PPD positive donors and normal donors as
determined by indirect ELISA.

Figure 10 illustrates the reactivity of the recombinant antigen TbH-33 (SEQ
ID NO: 140) with sera from M. tuberculosis patients and from normal donors, and with a pool
of sera from M. tuberculosis patients, as determined both by direct and indirect ELISA.

Figure 11 illustrates the reactivity of increasing concentrations of the
recombinant antigen TbH-33 (SEQ ID NO: 140) with sera from M. tuberculosis patients and
from normal donors as determined by ELISA.



SEQ. ID NO. 1 is the DNA sequence of TbRa1.
SEQ. ID NO. 2 is the DNA sequence of TbRa10.
SEQ. ID NO. 3 is the DNA sequence of TbRa11.
SEQ. ID NO. 4 is the DNA sequence of TbRa12.
SEQ. ID NO. 5 is the DNA sequence of TbRa13.
SEQ. ID NO. 6 is the DNA sequence of TbRa16.
SEQ. ID NO. 7 is the DNA sequence of TbRa17.
SEQ. ID NO. 8 is the DNA sequence of TbRa18.
SEQ. ID NO. 9 is the DNA sequence of TbRa19.

SEQ. ID NO. 35 is the DNA sequence of TbM-3.

SEQ. ID NO. 36 is the DNA sequence of TbM-6. 

SEQ. ID NO. 37 is the DNA sequence of TbM-7. 

SEQ. ID NO. 38 is the DNA sequence of TbM-9. 
SEQ. ID NO. 39 is the DNA sequence of TbM-12.

SEQ. ID NO. 40 is the DNA sequence of TbM-13. 

SEQ. ID NO. 41 is the DNA sequence of TbM-14. 

SEQ. ID NO. 42 is the DNA sequence of TbM-15.

SEQ. ID NO. 43 is the DNA sequence of TbH-4. 
SEQ. ID NO. 44 is the DNA sequence of TbH-4-FWD.

SEQ. ID NO. 45 is the DNA sequence of TbH-12. 

SEQ. ID NO. 46 is the DNA sequence of Tb38-1.

SEQ. ID NO. 47 is the DNA sequence of Tb38-4. 

SEQ. ID NO. 48 is the DNA sequence of TbL-17. 
SEQ. ID NO. 49 is the DNA sequence of TbL-20.

SEQ. ID NO. 50 is the DNA sequence of TbL-21. 

SEQ. ID NO. 51 is the DNA sequence of TbH-16.

SEQ. ID NO. 52 is the DNA sequence of DPEP. 

SEQ. ID NO. 53 is the deduced amino acid sequence of DPEP. 
SEQ. ID NO. 54 is the protein sequence of DPV N-terminal Antigen.

SEQ. ID NO. 55 is the protein sequence of AVGS N-terminal Antigen. 

SEQ. ID NO. 56 is the protein sequence of AAMK N-terminal Antigen. 

SEQ. ID NO. 57 is the protein sequence of YYWC N-terminal Antigen. 

SEQ. ID NO. 58 is the protein sequence of DIGS N-terminal Antigen. 
SEQ. ID NO. 59 is the protein sequence of AEES N-terminal Antigen.

SEQ. ID NO. 60 is the protein sequence of DPEP N-terminal Antigen. 

SEQ. ID NO. 61 is the protein sequence of APKT N-terminal Antigen. 

SEQ. ID NO. 62 is the protein sequence of DPAS N-terminal Antigen. 

SEQ. ID NO. 63 is the deduced amino acid sequence of TbM-1 Peptide.

SEQ. ID NO. 64 is the deduced amino acid sequence of TbRa1.

SEQ. ID NO. 95 is the deduced amino acid sequence of DPAS.

SEQ. ID NO. 96 is the DNA sequence of DPV. 

SEQ. ID NO. 97 is the deduced amino acid sequence of DPV. 

SEQ. ID NO. 98 is the DNA sequence of ESAT-6. 

SEQ. ID NO. 99 is the deduced amino acid sequence of ESAT-6. 

SEQ. ID NO. 100 is the DNA sequence of TbH-8-2. 

SEQ. ID NO. 101 is the DNA sequence of TbH-9FL. 

SEQ. ID NO. 102 is the deduced amino acid sequence of TbH-9FL. 

SEQ. ID NO. 103 is the DNA sequence of TbH-9-1. 

SEQ. ID NO. 104 is the deduced amino acid sequence of TbH-9-1. 

SEQ. ID NO. 105 is the DNA sequence of TbH-9-4. 

SEQ. ID NO. 106 is the deduced amino acid sequence of TbH-9-4. 

SEQ. ID NO. 107 is the DNA sequence of Tb38-1F2 IN. 

SEQ. ID NO. 108 is the DNA sequence of Tb38-1F2 RP. 

SEQ. ID NO. 109 is the deduced amino acid sequence of Tb37-FL. 

SEQ. ID NO. 110 is the deduced amino acid sequence of Tb38-IN.

SEQ. ID NO. 111 is the DNA sequence of Tb38-1F3.

SEQ. ID NO. 112 is the deduced amino acid sequence of Tb38-1F3.

SEQ. ID NO. 113 is the DNA sequence of Tb38-1F5.

SEQ. ID NO. 114 is the DNA sequence of Tb38-1F6.

SEQ. ID NO. 115 is the deduced N-terminal amino acid sequence of DPV.

SEQ. ID NO. 116 is the deduced N-terminal amino acid sequence of AVGS.

SEQ. ID NO. 117 is the deduced N-terminal amino acid sequence of AAMK.

SEQ. ID NO. 118 is the deduced N-terminal amino acid sequence of YYWC.

SEQ. ID NO. 119 is the deduced N-terminal amino acid sequence of DIGS.

SEQ. ID NO. 120 is the deduced N-terminal amino acid sequence of AEES.

SEQ. ID NO. 121 is the deduced N-terminal amino acid sequence of DPEP. 

SEQ. ID NO. 122 is the deduced N-terminal amino acid sequence of APKT. 

SEQ. ID NO. 123 is the deduced N-terminal amino acid sequence of DPAS. 

SEQ. ID NO. 124 is the protein sequence of DPPD N-terminal Antigen. 
SEQ ID NO. 125-128 are the protein sequences of four DPPD cyanogen bromide
fragments. 



SEQ ID NO. 129 is the N-terminal protein sequence of XDS antigen.
SEQ ID NO. 130 is the N-terminal protein sequence of AGD antigen.
SEQ ID NO. 131 is the N-terminal protein sequence of APE antigen.
SEQ ID NO. 132 is the N-terminal protein sequence of XYI antigen.
SEQ ID NO. 133 is the DNA sequence of TbH-29.
SEQ ID NO. 134 is the DNA sequence of TbH-30.
SEQ ID NO. 135 is the DNA sequence of TbH-32.
SEQ ID NO. 136 is the DNA sequence of TbH-33.
SEQ ID NO. 137 is the predicted amino acid sequence of TbH-29.
SEQ ID NO. 138 is the predicted amino acid sequence of TbH-30.
SEQ ID NO. 139 is the predicted amino acid sequence of TbH-32.
SEQ ID NO. 140 is the predicted amino acid sequence of TbH-33.
SEQ ID NO: 141-146 are PCR primers used in the preparation of a fusion protein 
containing TbRa3, 38 kD and Tb38-1. 

SEQ ID NO: 147 is the DNA sequence of the fusion protein containing TbRa3, 38 kD
and Tb38-1.

SEQ ID NO: 148 is the amino acid sequence of the fusion protein containing TbRa3,
38 kD and Tb38-1.

SEQ ID NO: 149 is the DNA sequence of the M. tuberculosis antigen 38 kD. 

SEQ ID NO: 150 is the amino acid sequence of the M. tuberculosis antigen 38 kD. 

SEQ ID NO: 151 is the DNA sequence of XP14. 

SEQ ID NO: 152 is the DNA sequence of XP24. 

SEQ ID NO: 153 is the DNA sequence of XP31.

SEQ ID NO: 154 is the 5' DNA sequence of XP32. 

SEQ ID NO: 155 is the 3' DNA sequence of XP32. 

SEQ ID NO: 156 is the predicted amino acid sequence of XP14. 

SEQ ID NO: 157 is the predicted amino acid sequence encoded by the reverse
complement of XP14.

SEQ ID NO: 158 is the DNA sequence of XP27.

SEQ ID NO: 159 is the DNA sequence of XP36.

SEQ ID NO: 160 is the 5' DNA sequence of XP4.

SEQ ID NO: 161 is the 5' DNA sequence of XP5.

SEQ ID NO: 162 is the 5' DNA sequence of XP17.

SEQ ID NO: 163 is the 5' DNA sequence of XP30. 

SEQ ID NO: 164 is the 5' DNA sequence of XP2. 

SEQ ID NO: 165 is the 3' DNA sequence of XP2. 

SEQ ID NO: 166 is the 5' DNA sequence of XP3. 
SEQ ID NO: 167 is the 3' DNA sequence of XP3.

SEQ ID NO: 168 is the 5' DNA sequence of XP6. 

SEQ ID NO: 169 is the 3' DNA sequence of XP6. 

SEQ ID NO: 170 is the 5' DNA sequence of XP18. 

SEQ ID NO: 171 is the 3' DNA sequence of XP18. 
SEQ ID NO: 172 is the 5' DNA sequence of XP19.

SEQ ID NO: 173 is the 3' DNA sequence of XP19.

SEQ ID NO: 174 is the 5' DNA sequence of XP22.

SEQ ID NO: 175 is the 3' DNA sequence of XP22.

SEQ ID NO: 176 is the 5' DNA sequence of XP25. 
SEQ ID NO: 177 is the 3' DNA sequence of XP25.

SEQ ID NO: 178 is the full-length DNA sequence of TbH4-XP1.

SEQ ID NO: 179 is the predicted amino acid sequence of TbH4-XP1.

SEQ ID NO: 180 is the predicted amino acid sequence encoded by the reverse
complement of TbH4-XP1.

SEQ ID NO: 181 is a first predicted amino acid sequence encoded by XP36.

SEQ ID NO: 182 is a second predicted amino acid sequence encoded by XP36. 

SEQ ID NO: 183 is the predicted amino acid sequence encoded by the reverse
complement of XP36.

SEQ ID NO: 184 is the DNA sequence of RDIF2.

SEQ ID NO: 185 is the DNA sequence of RDIF5.

containing TbRa3, 38 kD, Tb38-1 and DPEP (hereinafter referred to as TbF-2).
SEQ ID NO: 208 is the DNA sequence of the fusion protein TbF-2. 
SEQ ID NO: 209 is the amino acid sequence of the fusion protein TbF-2. 



DETAILED DESCRIPTION OF THE INVENTION 

As noted above, the present invention is generally directed to compositions
and methods for diagnosing tuberculosis. The compositions of the subject invention include
polypeptides that comprise at least one antigenic portion of an M. tuberculosis antigen, or a
variant of such an antigen that differs only in conservative substitutions and/or modifications.
Polypeptides within the scope of the present invention include, but are not limited to, soluble
M. tuberculosis antigens. A "soluble M. tuberculosis antigen" is a protein of M. tuberculosis
origin that is present in M. tuberculosis culture filtrate. As used herein, the term
"polypeptide" encompasses amino acid chains of any length, including full length proteins
(i.e., antigens), wherein the amino acid residues are linked by covalent peptide bonds. Thus,
a polypeptide comprising an antigenic portion of one of the above antigens may consist
entirely of the antigenic portion, or may contain additional sequences. The additional
sequences may be derived from the native M. tuberculosis antigen or may be heterologous,
and such sequences may (but need not) be antigenic.
An "antigenic portion" of an antigen (which may or may not be soluble) is a
portion that is capable of reacting with sera obtained from an M. tuberculosis-infected
individual (i.e., generates an absorbance reading with sera from infected individuals that is at
least three standard deviations above the absorbance obtained with sera from uninfected
individuals, in a representative ELISA assay described herein). An "M. tuberculosis-infected
individual" is a human who has been infected with M. tuberculosis (e.g., has an intradermal
skin test response to PPD that is at least 0.5 cm in diameter). Infected individuals may
display symptoms of tuberculosis or may be free of disease symptoms. Polypeptides
comprising at least an antigenic portion of one or more M. tuberculosis antigens as described
herein may generally be used, alone or in combination, to detect tuberculosis in a patient.
The compositions and methods of this invention also encompass variants of
the above polypeptides. A "variant," as used herein, is a polypeptide that differs from the
native antigen only in conservative substitutions and/or modifications, such that the antigenic
properties of the polypeptide are retained. Such variants may generally be identified by
modifying one of the above polypeptide sequences, and evaluating the antigenic properties of
the modified polypeptide using, for example, the representative procedures described herein.

A "conservative substitution" is one in which an amino acid is substituted for
another amino acid that has similar properties, such that one skilled in the art of peptide
chemistry would expect the secondary structure and hydropathic nature of the polypeptide to
be substantially unchanged. In general, the following groups of amino acids represent
conservative changes: (1) ala, pro, gly, glu, asp, gln, asn, ser, thr; (2) cys, ser, tyr, thr; (3) val,
ile, leu, met, ala, phe; (4) lys, arg, his; and (5) phe, tyr, trp, his.
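
By way of illustration only, the five groups recited above can be expressed as a simple lookup. The following Python sketch is not part of the disclosed procedures; the function names are hypothetical, and the equal-length, position-by-position comparison is an assumption that ignores the deletions, additions and other modifications a variant may also carry.

```python
# Conservative-change groups as recited in the text (three-letter codes, lower case).
CONSERVATIVE_GROUPS = [
    {"ala", "pro", "gly", "glu", "asp", "gln", "asn", "ser", "thr"},
    {"cys", "ser", "tyr", "thr"},
    {"val", "ile", "leu", "met", "ala", "phe"},
    {"lys", "arg", "his"},
    {"phe", "tyr", "trp", "his"},
]


def is_conservative(original, replacement):
    """True if both residues fall within at least one of the groups above."""
    a, b = original.lower(), replacement.lower()
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def differs_only_conservatively(native, candidate):
    """Compare two equal-length residue lists position by position.

    Assumption: insertions and deletions are out of scope for this sketch.
    """
    if len(native) != len(candidate):
        return False
    return all(
        n.lower() == c.lower() or is_conservative(n, c)
        for n, c in zip(native, candidate)
    )


print(is_conservative("Leu", "Ile"))
print(differs_only_conservatively(["Asp", "Pro", "Val"], ["Glu", "Pro", "Ile"]))
```

A residue may belong to more than one group (for example, ser and thr appear in groups (1) and (2)), which is why the check asks whether any single group contains both residues.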

Variants may also (or alternatively) be modified by, for example, the deletion
or addition of amino acids that have minimal influence on the antigenic properties, secondary
structure and hydropathic nature of the polypeptide. For example, a polypeptide may be
conjugated to a signal (or leader) sequence at the N-terminal end of the protein which
co-translationally or post-translationally directs transfer of the protein. The polypeptide may
also be conjugated to a linker or other sequence for ease of synthesis, purification or
identification of the polypeptide (e.g., poly-His), or to enhance binding of the polypeptide to a
solid support. For example, a polypeptide may be conjugated to an immunoglobulin Fc
region.

In a related aspect, combination polypeptides are disclosed. A "combination 
polypeptide" is a polypeptide comprising at least one of the above antigenic portions and one 
or more additional antigenic M. tuberculosis sequences, which are joined via a peptide 
linkage into a single amino acid chain. The sequences may be joined directly (i.e., with no
intervening amino acids) or may be joined by way of a linker sequence (e.g., Gly-Cys-Gly)
that does not significantly diminish the antigenic properties of the component polypeptides. 

In general, M. tuberculosis antigens, and DNA sequences encoding such 
antigens, may be prepared using any of a variety of procedures. For example, soluble 
antigens may be isolated from M. tuberculosis culture filtrate by procedures known to those
of ordinary skill in the art, including anion-exchange and reverse phase chromatography.
Purified antigens may then be evaluated for a desired property, such as the ability to react
with sera obtained from an M. tuberculosis-infected individual. Such screens may be
performed using the representative methods described herein. Antigens may then be partially
sequenced using, for example, traditional Edman chemistry. See Edman and Berg, Eur. J.
Biochem. 80:116-132, 1967.

Antigens may also be produced recombinantly using a DNA sequence that 
encodes the antigen, which has been inserted into an expression vector and expressed in an 
appropriate host. DNA molecules encoding soluble antigens may be isolated by screening an 
appropriate M. tuberculosis expression library with anti-sera (e.g., rabbit) raised specifically
against soluble M. tuberculosis antigens. DNA sequences encoding antigens that may or may
not be soluble may be identified by screening an appropriate M. tuberculosis genomic or
cDNA expression library with sera obtained from patients infected with M. tuberculosis.
Such screens may generally be performed using techniques well known in the art, such as
those described in Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring
Harbor Laboratories, Cold Spring Harbor, NY, 1989.
DNA sequences encoding soluble antigens may also be obtained by screening
an appropriate M. tuberculosis cDNA or genomic DNA library for DNA sequences that
hybridize to degenerate oligonucleotides derived from partial amino acid sequences of
isolated soluble antigens. Degenerate oligonucleotide sequences for use in such a screen may
be designed and synthesized, and the screen may be performed, as described (for example) in
Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratories, Cold Spring Harbor, NY (and references cited therein). Polymerase chain
reaction (PCR) may also be employed, using the above oligonucleotides in methods well
known in the art, to isolate a nucleic acid probe from a cDNA or genomic library. The library
screen may then be performed using the isolated probe.
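
As a hedged illustration of how a degenerate oligonucleotide may be derived from a partial amino acid sequence, the following Python sketch back-translates a short peptide using the standard genetic code and IUPAC ambiguity codes. It is an assumption-laden example rather than the procedure of Sambrook et al.: the function names are hypothetical, the residue Xaa is not handled, and no account is taken of M. tuberculosis codon usage.

```python
from itertools import product
from math import prod

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {}
for codon, aa in zip(product(BASES, repeat=3), AMINO):
    CODONS.setdefault(aa, []).append("".join(codon))

THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# IUPAC nucleotide ambiguity codes, keyed by the set of bases each one covers.
IUPAC = {frozenset(bases): code for code, bases in {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}.items()}


def codons_for(residue):
    """All standard codons for a three-letter residue name such as 'Tyr'."""
    return CODONS[THREE_TO_ONE[residue.capitalize()]]


def degenerate_codon(residue):
    """Position-wise union of the residue's codons, written in IUPAC codes.

    Note: for Leu, Arg and Ser the union covers more triplets than the six
    true codons, so the result is a superset for those residues.
    """
    codons = codons_for(residue)
    return "".join(IUPAC[frozenset(c[i] for c in codons)] for i in range(3))


def degenerate_oligo(residues):
    return "".join(degenerate_codon(r) for r in residues)


def degeneracy(residues):
    """Number of distinct coding sequences for the peptide."""
    return prod(len(codons_for(r)) for r in residues)


# First five residues of N-terminal sequence (d), SEQ ID NO: 118.
peptide = "Tyr-Tyr-Trp-Cys-Pro".split("-")
print(degenerate_oligo(peptide), degeneracy(peptide))
```

In practice a stretch rich in residues with few codons (Met, Trp, Tyr, Cys and the like) would typically be preferred, since the degeneracy figure grows multiplicatively with each residue.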

Regardless of the method of preparation, the antigens described herein are 
"antigenic." More specifically, the antigens have the ability to react with sera obtained from 
an M. tuberculosis-infected individual. Reactivity may be evaluated using, for example, the 
representative ELISA assays described herein, where an absorbance reading with sera from
infected individuals that is at least three standard deviations above the absorbance obtained
with sera from uninfected individuals is considered positive. 
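
The positivity criterion stated above can be sketched numerically as follows. This Python example is illustrative only: the absorbance values are hypothetical, the function names are not from the specification, and the use of the mean of the uninfected readings with a sample (n-1) standard deviation is an assumption, since the text does not say how the baseline or the deviation is computed.

```python
from statistics import mean, stdev


def positivity_cutoff(uninfected_od, n_sd=3.0):
    """Mean of the uninfected-serum absorbances plus n_sd standard deviations.

    Assumption: sample standard deviation (n-1 denominator) is used.
    """
    if len(uninfected_od) < 2:
        raise ValueError("need at least two uninfected control readings")
    return mean(uninfected_od) + n_sd * stdev(uninfected_od)


def is_reactive(sample_od, uninfected_od, n_sd=3.0):
    """True if the reading is at least n_sd standard deviations above the controls."""
    return sample_od >= positivity_cutoff(uninfected_od, n_sd)


# Hypothetical absorbance readings for sera from uninfected donors.
controls = [0.10, 0.12, 0.08, 0.11, 0.09]
print(round(positivity_cutoff(controls), 3))
print(is_reactive(0.45, controls), is_reactive(0.13, controls))
```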

Antigenic portions of M. tuberculosis antigens may be prepared and identified
using well known techniques, such as those summarized in Paul, Fundamental Immunology,
3d ed., Raven Press, 1993, pp. 243-247 and references cited therein. Such techniques include
screening polypeptide portions of the native antigen for antigenic properties. The
representative ELISAs described herein may generally be employed in these screens. An
antigenic portion of a polypeptide is a portion that, within such representative assays,
generates a signal that is substantially similar to that generated by the full
length antigen. In other words, an antigenic portion of an M. tuberculosis antigen generates at
least about 20%, and preferably about 100%, of the signal induced by the full length antigen
in a model ELISA as described herein.
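
The "at least about 20%" comparison can be sketched as a simple ratio. The following Python example is a minimal illustration under stated assumptions: the readings are hypothetical, the function names are not from the specification, and the optional background subtraction is an added convenience that the text does not prescribe.

```python
def relative_signal(portion_od, full_length_od, background_od=0.0):
    """Signal of the portion as a fraction of the full length antigen signal."""
    full_net = full_length_od - background_od
    if full_net <= 0:
        raise ValueError("full length signal must exceed background")
    return (portion_od - background_od) / full_net


def is_antigenic_portion(portion_od, full_length_od,
                         background_od=0.0, min_fraction=0.20):
    """Apply the 'at least about 20%' threshold recited in the text."""
    return relative_signal(portion_od, full_length_od, background_od) >= min_fraction


# Hypothetical ELISA absorbances for a fragment and the full length antigen.
print(is_antigenic_portion(0.30, 1.20))
print(is_antigenic_portion(0.12, 1.20))
```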

Portions and other variants of M. tuberculosis antigens may be generated by
synthetic or recombinant means. Synthetic polypeptides having fewer than about 100 amino
acids, and generally fewer than about 50 amino acids, may be generated using techniques
well known in the art. For example, such polypeptides may be synthesized using any of the
commercially available solid-phase techniques, such as the Merrifield solid-phase synthesis
method, where amino acids are sequentially added to a growing amino acid chain. See
Merrifield, J. Am. Chem. Soc. 85:2149-2154, 1963. Equipment for automated synthesis of
polypeptides is commercially available from suppliers such as Applied BioSystems, Inc.,
Foster City, CA, and may be operated according to the manufacturer's instructions. Variants
of a native antigen may generally be prepared using standard mutagenesis techniques, such as
oligonucleotide-directed site-specific mutagenesis. Sections of the DNA sequence may also
be removed using standard techniques to permit preparation of truncated polypeptides.

Recombinant polypeptides containing portions and/or variants of a native
antigen may be readily prepared from a DNA sequence encoding the polypeptide using a
variety of techniques well known to those of ordinary skill in the art. For example,
supernatants from suitable host/vector systems which secrete recombinant protein into culture
media may be first concentrated using a commercially available filter. Following
concentration, the concentrate may be applied to a suitable purification matrix such as an
affinity matrix or an ion exchange resin. Finally, one or more reverse phase HPLC steps can
be employed to further purify a recombinant protein. 

Any of a variety of expression vectors known to those of ordinary skill in the 
art may be employed to express recombinant polypeptides as described herein. Expression 
may be achieved in any appropriate host cell that has been transformed or transfected with an
expression vector containing a DNA molecule that encodes a recombinant polypeptide.
Suitable host cells include prokaryotes, yeast and higher eukaryotic cells. Preferably, the host 
cells employed are E. coli, yeast or a mammalian cell line, such as COS or CHO. The DNA 
sequences expressed in this manner may encode naturally occurring antigens, portions of 
naturally occurring antigens, or other variants thereof. 

In general, regardless of the method of preparation, the polypeptides disclosed
herein are prepared in substantially pure form. Preferably, the polypeptides are at least about
80% pure, more preferably at least about 90% pure and most preferably at least about 99% 
pure. For use in the methods described herein, however, such substantially pure polypeptides 
may be combined. 
In certain specific embodiments, the subject invention discloses polypeptides
comprising at least an antigenic portion of a soluble M. tuberculosis antigen (or a variant of 
such an antigen), where the antigen has one of the following N-terminal sequences: 

(a) Asp-Pro-Val-Asp-Ala-Val-Ile-Asn-Thr-Thr-Cys-Asn-Tyr-Gly-Gln- 
Val-Val-Ala-Ala-Leu (SEQ ID NO: 115);

(b) Ala-Val-Glu-Ser-Gly-Met-Leu-Ala-Leu-Gly-Thr-Pro-Ala-Pro-Ser 
(SEQ ID NO: 116); 

(c) Ala-Ala-Met-Lys-Pro-Arg-Thr-Gly-Asp-Gly-Pro-Leu-Glu-Ala-Ala- 
Lys-Glu-Gly-Arg (SEQ ID NO: 117); 

(d) Tyr-Tyr-Trp-Cys-Pro-Gly-Gln-Pro-Phe-Asp-Pro-Ala-Trp-Gly-Pro 
(SEQ ID NO: 118); 

(e) Asp-Ile-Gly-Ser-Glu-Ser-Thr-Glu-Asp-Gln-Gln-Xaa-Ala-Val (SEQ ID 
NO: 119); 

(f) Ala-Glu-Glu-Ser-Ile-Ser-Thr-Xaa-Glu-Xaa-Ile-Val-Pro (SEQ ID 
NO: 120); 

(g) Asp-Pro-Glu-Pro-Ala-Pro-Pro-Val-Pro-Thr-Thr-Ala-Ala-Ser-Pro-Pro-
Ser (SEQ ID NO: 121);

(h) Ala-Pro-Lys-Thr-Tyr-Xaa-Glu-Glu-Leu-Lys-Gly-Thr-Asp-Thr-Gly
(SEQ ID NO: 122);

(i) Asp-Pro-Ala-Ser-Ala-Pro-Asp-Val-Pro-Thr-Ala-Ala-Gln-Gln-Thr-Ser-
Leu-Leu-Asn-Ser-Leu-Ala-Asp-Pro-Asn-Val-Ser-Phe-Ala-Asn (SEQ
ID NO: 123);

(j) Xaa-Asp-Ser-Glu-Lys-Ser-Ala-Thr-Ile-Lys-Val-Thr-Asp-Ala-Ser
(SEQ ID NO: 129);

(k) Ala-Gly-Asp-Thr-Xaa-Ile-Tyr-Ile-Val-Gly-Asn-Leu-Thr-Ala-Asp
(SEQ ID NO: 130); or

(l) Ala-Pro-Glu-Ser-Gly-Ala-Gly-Leu-Gly-Gly-Thr-Val-Gln-Ala-Gly
(SEQ ID NO: 131);

wherein Xaa may be any amino acid, preferably a cysteine residue. A DNA sequence 
encoding the antigen identified as (g) above is provided in SEQ ID NO: 52, the deduced
amino acid sequence of which is provided in SEQ ID NO: 53. A DNA sequence encoding
the antigen identified as (a) above is provided in SEQ ID NO: 96; its deduced amino acid
sequence is provided in SEQ ID NO: 97. A DNA sequence corresponding to antigen (d)
above is provided in SEQ ID NO: 24, and a DNA sequence corresponding to antigen (c) is
provided in SEQ ID NO: 25. A DNA sequence corresponding to antigen (I) is disclosed in
SEQ ID NO: 94, and its deduced amino acid sequence is provided in SEQ ID NO: 95.

In a further specific embodiment, the subject invention discloses polypeptides 
comprising at least an immunogenic portion of an M. tuberculosis antigen having one of the
following N-terminal sequences, or a variant thereof that differs only in conservative
substitutions and/or modifications:

(m) Xaa-Tyr-Ile-Ala-Tyr-Xaa-Thr-Thr-Ala-Gly-Ile-Val-Pro-Gly-Lys-Ile-
Asn-Val-His-Leu-Val (SEQ ID NO: 132); or

(n) Asp-Pro-Pro-Asp-Pro-His-Gln-Xaa-Asp-Met-Thr-Lys-Gly-Tyr-Tyr-
Pro-Gly-Gly-Arg-Arg-Xaa-Phe (SEQ ID NO: 124);

wherein Xaa may be any amino acid, preferably a cysteine residue. 

In other specific embodiments, the subject invention discloses polypeptides 
comprising at least an antigenic portion of a soluble M. tuberculosis antigen (or a variant of
such an antigen) that comprises one or more of the amino acid sequences encoded by (a) the
DNA sequences of SEQ ID NOS: 1, 2, 4-10, 13-25, 52, 94 and 96, (b) the complements of
such DNA sequences, or (c) DNA sequences substantially homologous to a sequence in (a) or
(b).

In further specific embodiments, the subject invention discloses polypeptides 
comprising at least an antigenic portion of an M. tuberculosis antigen (or a variant of such an
antigen), which may or may not be soluble, that comprises one or more of the amino acid
sequences encoded by (a) the DNA sequences of SEQ ID NOS: 26-51, 133, 134, 158-178 and
196, (b) the complements of such DNA sequences or (c) DNA sequences substantially
homologous to a sequence in (a) or (b). 

In the specific embodiments discussed above, the M. tuberculosis antigens
include variants that are encoded by DNA sequences which are substantially homologous to one
or more of the DNA sequences specifically recited herein. "Substantial homology," as used
herein, refers to DNA sequences that are capable of hybridizing under moderately stringent
conditions. Suitable moderately stringent conditions include prewashing in a solution of 5X
SSC, 0.5% SDS, 1.0 mM EDTA (pH 8.0); hybridizing at 50°C-65°C, 5X SSC, overnight or,
in the event of cross-species homology, at 45°C with 0.5X SSC; followed by washing twice
at 65°C for 20 minutes with each of 2X, 0.5X and 0.2X SSC containing 0.1% SDS. Such
hybridizing DNA sequences are also within the scope of this invention, as are nucleotide
sequences that, due to code degeneracy, encode an immunogenic polypeptide that is encoded
by a hybridizing DNA sequence.

In a related aspect, the present invention provides fusion proteins comprising a
first and a second inventive polypeptide or, alternatively, a polypeptide of the present
invention and a known M. tuberculosis antigen, such as the 38 kD antigen described above or
ESAT-6 (SEQ ID NOS: 98 and 99), together with variants of such fusion proteins. The
fusion proteins of the present invention may also include a linker peptide between the first
and second polypeptides.

A DNA sequence encoding a fusion protein of the present invention is 
constructed using known recombinant DNA techniques to assemble separate DNA sequences 
encoding the first and second polypeptides into an appropriate expression vector. The 3' end
of a DNA sequence encoding the first polypeptide is ligated, with or without a peptide linker,
to the 5' end of a DNA sequence encoding the second polypeptide so that the reading frames
of the sequences are in phase to permit mRNA translation of the two DNA sequences into a 
single fusion protein that retains the biological activity of both the first and the second 
polypeptides. 
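
By way of illustration only, the in-phase joining described above may be sketched in Python as follows. The function name build_fusion and the short nucleotide sequences are hypothetical and are not taken from the sequence listing; the sketch merely checks that each segment is a whole number of codons and that no stop codon interrupts the fused reading frame.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def build_fusion(first_cds, second_cds, linker_cds=""):
    """Join two coding sequences (5'->3'), optionally via a linker, in one reading frame."""
    first, linker, second = (s.upper() for s in (first_cds, linker_cds, second_cds))
    # Drop the stop codon of the first sequence so that translation reads through.
    if first[-3:] in STOP_CODONS:
        first = first[:-3]
    # Every segment must be a whole number of codons to keep the frames in phase.
    for name, segment in (("first", first), ("linker", linker), ("second", second)):
        if len(segment) % 3 != 0:
            raise ValueError(f"{name} segment is not a multiple of 3 nucleotides")
    fusion = first + linker + second
    codons = [fusion[i:i + 3] for i in range(0, len(fusion), 3)]
    # A stop codon before the final codon would truncate the fusion protein.
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        raise ValueError("internal stop codon found in fusion sequence")
    return fusion

# Hypothetical toy sequences: Met-Ala, a Gly-Ser linker, then Asp-Pro and a stop codon.
fusion = build_fusion("ATGGCTTAA", "GATCCGTGA", linker_cds="GGTTCT")
```

A linker whose length is not a multiple of three nucleotides would shift the second polypeptide out of frame, which is why the sketch rejects it rather than attempting a repair.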

A peptide linker sequence may be employed to separate the first and the 
second polypeptides by a distance sufficient to ensure that each polypeptide folds into its
secondary and tertiary structures. Such a peptide linker sequence is incorporated into the
fusion protein using standard techniques well known in the art. Suitable peptide linker
sequences may be chosen based on the following factors: (1) their ability to adopt a flexible
extended conformation; (2) their inability to adopt a secondary structure that could interact
with functional epitopes on the first and second polypeptides; and (3) the lack of hydrophobic



WO 98/16645 PCT/US97/18214 

or charged residues that might react with the polypeptide functional epitopes. Preferred 
peptide linker sequences contain Gly, Asn and Ser residues. Other near neutral amino acids, 
such as Thr and Ala may also be used in the linker sequence. Amino acid sequences which 
may be usefully employed as linkers include those disclosed in Maratea et al., Gene 40:39-46, 
1985; Murphy et al., Proc. Natl. Acad. Sci. USA 83:8258-8562, 1986; U.S. Patent
No. 4,935,233 and U.S. Patent No. 4,751,180. The linker sequence may be from 1 to about 
50 amino acids in length. Peptide linker sequences are not required when the first and second 
polypeptides have non-essential N-terminal amino acid regions that can be used to separate 
the functional domains and prevent steric hindrance. 
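
As a rough illustration of the composition and length criteria stated above, a candidate linker may be screened as in the following Python sketch. The function name linker_ok and the example sequences are hypothetical; the check covers only residue composition (Gly, Asn, Ser, Thr, Ala) and the 1 to about 50 residue length range, and does not assess conformation or interaction with epitopes.

```python
PREFERRED = set("GNS")      # Gly, Asn, Ser (one-letter codes)
NEAR_NEUTRAL = set("TA")    # Thr, Ala

def linker_ok(sequence, max_length=50):
    """Return True if the linker uses only the residues named above and is 1..max_length long."""
    sequence = sequence.upper()
    allowed = PREFERRED | NEAR_NEUTRAL
    return 1 <= len(sequence) <= max_length and set(sequence) <= allowed

candidates = ["GGSGGS", "GGKGG", "GSTSGSGA"]
accepted = [c for c in candidates if linker_ok(c)]
```

The candidate containing lysine is rejected because charged residues are among those the criteria seek to avoid.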

In another aspect, the present invention provides methods for using the
polypeptides described above to diagnose tuberculosis. In this aspect, methods are provided
for detecting M. tuberculosis infection in a biological sample, using one or more of the above
polypeptides, alone or in combination. In embodiments in which multiple polypeptides are
employed, polypeptides other than those specifically described herein, such as the 38 kD
antigen described in Andersen and Hansen, Infect. Immun. 57:2481-2488, 1989, may be
included. As used herein, a "biological sample" is any antibody-containing sample obtained
from a patient. Preferably, the sample is whole blood, sputum, serum, plasma, saliva,
cerebrospinal fluid or urine. More preferably, the sample is a blood, serum or plasma sample
obtained from a patient or a blood supply. The polypeptide(s) are used in an assay, as
described below, to determine the presence or absence of antibodies to the polypeptide(s) in
the sample, relative to a predetermined cut-off value. The presence of such antibodies 
indicates previous sensitization to mycobacterial antigens which may be indicative of 
tuberculosis. 

In embodiments in which more than one polypeptide is employed, the 
polypeptides used are preferably complementary (i.e., one component polypeptide will tend
to detect infection in samples where the infection would not be detected by another
component polypeptide). Complementary polypeptides may generally be identified by using
each polypeptide individually to evaluate serum samples obtained from a series of patients
known to be infected with M. tuberculosis. After determining which samples test positive (as
described below) with each polypeptide, combinations of two or more polypeptides may be
formulated that are capable of detecting infection in most, or all, of the samples tested. Such
polypeptides are complementary. For example, approximately 25-30% of sera from
tuberculosis-infected individuals are negative for antibodies to any single protein, such as the
38 kD antigen mentioned above. Complementary polypeptides may, therefore, be used in
combination with the 38 kD antigen to improve sensitivity of a diagnostic test.
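
One possible way to formulate such a combination, offered only as a sketch, is a greedy selection in which each added polypeptide is the one that detects the most samples not yet detected. The Python below assumes that the set of positive samples for each polypeptide has already been determined; the names "AgX" and "AgY", the sample identifiers and the function pick_complementary are hypothetical.

```python
def pick_complementary(reactivity, max_antigens=3):
    """Greedily choose antigens so that together they detect as many samples as possible.

    reactivity maps an antigen name to the set of sample ids that tested positive with it.
    """
    chosen, covered = [], set()
    remaining = dict(reactivity)
    while remaining and len(chosen) < max_antigens:
        # Pick the antigen adding the most newly detected samples (ties broken by name).
        best = max(sorted(remaining), key=lambda name: len(remaining[name] - covered))
        gain = remaining[best] - covered
        if not gain:
            break  # no remaining antigen detects anything new
        chosen.append(best)
        covered |= gain
        del remaining[best]
    return chosen, covered

# Hypothetical panel of ten sera from infected patients.
reactivity = {
    "38kD": {1, 2, 3, 4, 5, 6, 7},
    "AgX": {1, 2, 8},
    "AgY": {8, 9, 10},
}
chosen, covered = pick_complementary(reactivity)
```

In this toy panel the second antigen chosen is the one reacting with the sera that the 38 kD antigen misses, which is the sense in which the polypeptides are complementary. A greedy choice is not guaranteed to be optimal for larger panels.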

There are a variety of assay formats known to those of ordinary skill in the art 
for using one or more polypeptides to detect antibodies in a sample. See, e.g., Harlow and 
Lane, Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory, 1988, which is 
incorporated herein by reference. In a preferred embodiment, the assay involves the use of
polypeptide immobilized on a solid support to bind to and remove the antibody from the
sample. The bound antibody may then be detected using a detection reagent that contains a
reporter group. Suitable detection reagents include antibodies that bind to the
antibody/polypeptide complex and free polypeptide labeled with a reporter group (e.g., in a
semi-competitive assay). Alternatively, a competitive assay may be utilized, in which an
antibody that binds to the polypeptide is labeled with a reporter group and allowed to bind to
the immobilized antigen after incubation of the antigen with the sample. The extent to which 
components of the sample inhibit the binding of the labeled antibody to the polypeptide is 
indicative of the reactivity of the sample with the immobilized polypeptide. 

The solid support may be any solid material known to those of ordinary skill
in the art to which the antigen may be attached. For example, the solid support may be a test
well in a microtiter plate or a nitrocellulose or other suitable membrane. Alternatively, the
support may be a bead or disc, such as glass, fiberglass, latex or a plastic material such as
polystyrene or polyvinylchloride. The support may also be a magnetic particle or a fiber 
optic sensor, such as those disclosed, for example, in U.S. Patent No. 5,359,681. 

The polypeptides may be bound to the solid support using a variety of
techniques known to those of ordinary skill in the art, which are amply described in the patent
and scientific literature. In the context of the present invention, the term "bound" refers to
both noncovalent association, such as adsorption, and covalent attachment (which may be a
direct linkage between the antigen and functional groups on the support or may be a linkage
by way of a cross-linking agent). Binding by adsorption to a well in a microtiter plate or to a
membrane is preferred. In such cases, adsorption may be achieved by contacting the
polypeptide, in a suitable buffer, with the solid support for a suitable amount of time. The
contact time varies with temperature, but is typically between about 1 hour and 1 day. In
general, contacting a well of a plastic microtiter plate (such as polystyrene or
polyvinylchloride) with an amount of polypeptide ranging from about 10 ng to about 1 µg,
and preferably about 100 ng, is sufficient to bind an adequate amount of antigen.

Covalent attachment of polypeptide to a solid support may generally be 
achieved by first reacting the support with a bifunctional reagent that will react with both the
support and a functional group, such as a hydroxyl or amino group, on the polypeptide. For
example, the polypeptide may be bound to supports having an appropriate polymer coating
using benzoquinone or by condensation of an aldehyde group on the support with an amine 
and an active hydrogen on the polypeptide (see, e.g., Pierce Immunotechnology Catalog and 
Handbook, 1991, at A12-A13). 

In certain embodiments, the assay is an enzyme linked immunosorbent assay
(ELISA). This assay may be performed by first contacting a polypeptide antigen that has
been immobilized on a solid support, commonly the well of a microtiter plate, with the
sample, such that antibodies to the polypeptide within the sample are allowed to bind to the
immobilized polypeptide. Unbound sample is then removed from the immobilized
polypeptide and a detection reagent capable of binding to the immobilized
antibody-polypeptide complex is added. The amount of detection reagent that remains bound to the
solid support is then determined using a method appropriate for the specific detection reagent. 

More specifically, once the polypeptide is immobilized on the support as 
described above, the remaining protein binding sites on the support are typically blocked. 
Any suitable blocking agent known to those of ordinary skill in the art, such as bovine serum
albumin or Tween 20™ (Sigma Chemical Co., St. Louis, MO) may be employed. The
immobilized polypeptide is then incubated with the sample, and antibody is allowed to bind
to the antigen. The sample may be diluted with a suitable diluent, such as phosphate-buffered
saline (PBS) prior to incubation. In general, an appropriate contact time (i.e., incubation
time) is that period of time that is sufficient to detect the presence of antibody within an
M. tuberculosis-infected sample. Preferably, the contact time is sufficient to achieve a level
of binding that is at least 95% of that achieved at equilibrium between bound and unbound
antibody. Those of ordinary skill in the art will recognize that the time necessary to achieve
equilibrium may be readily determined by assaying the level of binding that occurs over a
period of time. At room temperature, an incubation time of about 30 minutes is generally
sufficient.

Unbound sample may then be removed by washing the solid support with an 
appropriate buffer, such as PBS containing 0.1% Tween 20™. Detection reagent may then be 
added to the solid support. An appropriate detection reagent is any compound that binds to 
the immobilized antibody-polypeptide complex and that can be detected by any of a variety
of means known to those in the art. Preferably, the detection reagent contains a binding agent
(such as, for example, Protein A, Protein G, immunoglobulin, lectin or free antigen)
conjugated to a reporter group. Preferred reporter groups include enzymes (such as
horseradish peroxidase), substrates, cofactors, inhibitors, dyes, radionuclides, luminescent
groups, fluorescent groups and biotin. The conjugation of binding agent to reporter group
may be achieved using standard methods known to those of ordinary skill in the art.
Common binding agents may also be purchased conjugated to a variety of reporter groups 
from many commercial sources (e.g., Zymed Laboratories, San Francisco, CA, and Pierce, 
Rockford, IL). 

The detection reagent is then incubated with the immobilized
antibody-polypeptide complex for an amount of time sufficient to detect the bound antibody. An
appropriate amount of time may generally be determined from the manufacturer's instructions
or by assaying the level of binding that occurs over a period of time. Unbound detection
reagent is then removed and bound detection reagent is detected using the reporter group.
The method employed for detecting the reporter group depends upon the nature of the
reporter group. For radioactive groups, scintillation counting or autoradiographic methods
are generally appropriate. Spectroscopic methods may be used to detect dyes, luminescent 
groups and fluorescent groups. Biotin may be detected using avidin, coupled to a different 
reporter group (commonly a radioactive or fluorescent group or an enzyme). Enzyme 
reporter groups may generally be detected by the addition of substrate (generally for a 
specific period of time), followed by spectroscopic or other analysis of the reaction products.

To determine the presence or absence of anti-M. tuberculosis antibodies in the
sample, the signal detected from the reporter group that remains bound to the solid support is
generally compared to a signal that corresponds to a predetermined cut-off value. In one
preferred embodiment, the cut-off value is the mean signal obtained when the
immobilized antigen is incubated with samples from an uninfected patient. In general, a
sample generating a signal that is three standard deviations above the predetermined cut-off
value is considered positive for tuberculosis. In an alternate preferred embodiment, the
cut-off value is determined using a Receiver Operator Curve, according to the method of Sackett
et al., Clinical Epidemiology: A Basic Science for Clinical Medicine, Little Brown and Co.,
1985, pp. 106-107. Briefly, in this embodiment, the cut-off value may be determined from a
plot of pairs of true positive rates (i.e., sensitivity) and false positive rates (100%-specificity)
that correspond to each possible cut-off value for the diagnostic test result. The cut-off value
on the plot that is the closest to the upper left-hand corner (i.e., the value that encloses the
largest area) is the most accurate cut-off value, and a sample generating a signal that is higher
than the cut-off value determined by this method may be considered positive. Alternatively,
the cut-off value may be shifted to the left along the plot, to minimize the false positive rate,
or to the right, to minimize the false negative rate. In general, a sample generating a signal
that is higher than the cut-off value determined by this method is considered positive for
tuberculosis.
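
The two ways of setting a cut-off described above may be sketched in Python as follows. This is an illustration under stated assumptions and not the method of Sackett et al. as published: the signal values are hypothetical, the sample standard deviation is used, and the point "closest to the upper left-hand corner" is taken to be the candidate cut-off minimizing the Euclidean distance to (false positive rate 0, true positive rate 1).

```python
import math
import statistics

def mean_plus_3sd(uninfected_signals):
    """Positivity threshold: mean of uninfected signals plus three standard deviations."""
    return statistics.mean(uninfected_signals) + 3 * statistics.stdev(uninfected_signals)

def roc_cutoff(infected_signals, uninfected_signals):
    """Return (cutoff, true_positive_rate, false_positive_rate) nearest the top-left ROC corner."""
    best = None
    for cutoff in sorted(set(infected_signals) | set(uninfected_signals)):
        # A sample is called positive when its signal is higher than the cut-off.
        tpr = sum(s > cutoff for s in infected_signals) / len(infected_signals)
        fpr = sum(s > cutoff for s in uninfected_signals) / len(uninfected_signals)
        distance = math.hypot(fpr, 1.0 - tpr)
        if best is None or distance < best[0]:
            best = (distance, cutoff, tpr, fpr)
    return best[1:]

# Hypothetical optical density readings.
threshold = mean_plus_3sd([0.10, 0.12, 0.08, 0.10])
cutoff, tpr, fpr = roc_cutoff([0.9, 0.8, 0.7, 0.3], [0.1, 0.2, 0.25, 0.28])
```

Shifting the chosen cut-off upward from this point trades sensitivity for a lower false positive rate, and shifting it downward does the reverse.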

In a related embodiment, the assay is performed in a rapid flow-through or
strip test format, wherein the antigen is immobilized on a membrane, such as nitrocellulose.
In the flow-through test, antibodies within the sample bind to the immobilized polypeptide as
the sample passes through the membrane. A detection reagent (e.g., protein A-colloidal gold)
then binds to the antibody-polypeptide complex as the solution containing the detection
reagent flows through the membrane. The detection of bound detection reagent may then be
performed as described above. In the strip test format, one end of the membrane to which
polypeptide is bound is immersed in a solution containing the sample. The sample migrates
along the membrane through a region containing detection reagent and to the area of
immobilized polypeptide. Concentration of detection reagent at the polypeptide indicates the
presence of anti-M. tuberculosis antibodies in the sample. Typically, the concentration of
detection reagent at that site generates a pattern, such as a line, that can be read visually. The
absence of such a pattern indicates a negative result. In general, the amount of polypeptide
immobilized on the membrane is selected to generate a visually discernible pattern when the
biological sample contains a level of antibodies that would be sufficient to generate a positive
signal in an ELISA, as discussed above. Preferably, the amount of polypeptide immobilized
on the membrane ranges from about 25 ng to about 1 µg, and more preferably from about
50 ng to about 500 ng. Such tests can typically be performed with a very small amount (e.g., 
one drop) of patient serum or blood. 

Of course, numerous other assay protocols exist that are suitable for use with 
the polypeptides of the present invention. The above descriptions are intended to be 
exemplary only. 

In yet another aspect, the present invention provides antibodies to the 
inventive polypeptides. Antibodies may be prepared by any of a variety of techniques known 
to those of ordinary skill in the art. See, e.g., Harlow and Lane, Antibodies: A Laboratory 
Manual, Cold Spring Harbor Laboratory, 1988. In one such technique, an immunogen 
comprising the antigenic polypeptide is initially injected into any of a wide variety of 
mammals (e.g., mice, rats, rabbits, sheep and goats). In this step, the polypeptides of this 
invention may serve as the immunogen without modification. Alternatively, particularly for 
relatively short polypeptides, a superior immune response may be elicited if the polypeptide 
is joined to a carrier protein, such as bovine serum albumin or keyhole limpet hemocyanin. 
The immunogen is injected into the animal host, preferably according to a predetermined 
schedule incorporating one or more booster immunizations, and the animals are bled 
periodically. Polyclonal antibodies specific for the polypeptide may then be purified from 
such antisera by, for example, affinity chromatography using the polypeptide coupled to a 
suitable solid support. 

Monoclonal antibodies specific for the antigenic polypeptide of interest may 
be prepared, for example, using the technique of Kohler and Milstein, Eur. J. Immunol.
6:511-519, 1976, and improvements thereto. Briefly, these methods involve the preparation
of immortal cell lines capable of producing antibodies having the desired specificity (i.e.,
reactivity with the polypeptide of interest). Such cell lines may be produced, for example,
from spleen cells obtained from an animal immunized as described above. The spleen cells
are then immortalized by, for example, fusion with a myeloma cell fusion partner, preferably
one that is syngeneic with the immunized animal. A variety of fusion techniques may be
employed. For example, the spleen cells and myeloma cells may be combined with a
nonionic detergent for a few minutes and then plated at low density on a selective medium
that supports the growth of hybrid cells, but not myeloma cells. A preferred selection
technique uses HAT (hypoxanthine, aminopterin, thymidine) selection. After a sufficient
time, usually about 1 to 2 weeks, colonies of hybrids are observed. Single colonies are
selected and tested for binding activity against the polypeptide. Hybridomas having high
reactivity and specificity are preferred.

Monoclonal antibodies may be isolated from the supernatants of growing 
hybridoma colonies. In addition, various techniques may be employed to enhance the yield, 
such as injection of the hybridoma cell line into the peritoneal cavity of a suitable vertebrate 
host, such as a mouse. Monoclonal antibodies may then be harvested from the ascites fluid or
the blood. Contaminants may be removed from the antibodies by conventional techniques,
such as chromatography, gel filtration, precipitation, and extraction. The polypeptides of this 
invention may be used in the purification process in, for example, an affinity chromatography 
step. 

Antibodies may be used in diagnostic tests to detect the presence of 
M. tuberculosis antigens using assays similar to those detailed above and other techniques
well known to those of skill in the art, thereby providing a method for detecting 
M. tuberculosis infection in a patient. 

Diagnostic reagents of the present invention may also comprise DNA 
sequences encoding one or more of the above polypeptides, or one or more portions thereof. 
For example, at least two oligonucleotide primers may be employed in a polymerase chain
reaction (PCR) based assay to amplify M. tuberculosis-specific cDNA derived from a
biological sample, wherein at least one of the oligonucleotide primers is specific for a DNA
molecule encoding a polypeptide of the present invention. The presence of the amplified
cDNA is then detected using techniques well known in the art, such as gel electrophoresis.
Similarly, oligonucleotide probes specific for a DNA molecule encoding a polypeptide of the
present invention may be used in a hybridization assay to detect the presence of an inventive
polypeptide in a biological sample. 

As used herein, the term "oligonucleotide primer/probe specific for a DNA
molecule" means an oligonucleotide sequence that has at least about 80%, preferably at least
about 90% and more preferably at least about 95%, identity to the DNA molecule in question.
Oligonucleotide primers and/or probes which may be usefully employed in the inventive
diagnostic methods preferably have at least about 10-40 nucleotides. In a preferred
embodiment, the oligonucleotide primers comprise at least about 10 contiguous nucleotides
of a DNA molecule encoding one of the polypeptides disclosed herein. Preferably,
oligonucleotide probes for use in the inventive diagnostic methods comprise at least about 15
contiguous nucleotides of a DNA molecule encoding one of the polypeptides disclosed
herein. Techniques for both PCR based assays and hybridization assays are well known in
the art (see, for example, Mullis et al. Ibid; Ehrlich, Ibid). Primers or probes may thus be
used to detect M. tuberculosis-specific sequences in biological samples. DNA probes or
primers comprising oligonucleotide sequences described above may be used alone, in
combination with each other, or with previously identified sequences, such as the 38 kD 
antigen discussed above. 
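
The identity threshold in the definition above may be illustrated by the following Python sketch, which slides an oligonucleotide along a target sequence without gaps and reports the best percent identity found. The ungapped comparison, the function names and the short sequences are assumptions made for illustration; the specification does not prescribe a particular method of computing identity.

```python
def best_identity(oligo, target):
    """Best ungapped percent identity of oligo against any same-length window of target."""
    oligo, target = oligo.upper(), target.upper()
    n = len(oligo)
    if n == 0 or n > len(target):
        raise ValueError("oligo must be non-empty and no longer than the target")
    best_matches = 0
    for start in range(len(target) - n + 1):
        window = target[start:start + n]
        matches = sum(a == b for a, b in zip(oligo, window))
        best_matches = max(best_matches, matches)
    return 100.0 * best_matches / n

def is_specific(oligo, target, min_identity=80.0):
    """Apply the 'at least about 80%' identity criterion (90 or 95 may be passed instead)."""
    return best_identity(oligo, target) >= min_identity

# Hypothetical target and primers; not taken from the sequence listing.
target = "ATGGCCGATCCGGAACCGGCT"
exact = best_identity("GATCCGGAAC", target)
one_mismatch = best_identity("GATCCGCAAC", target)
```

A 10-mer with a single mismatch meets the 80% and 90% levels but not the 95% level, which shows how the preferred thresholds become more restrictive for short primers.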

The following Examples are offered by way of illustration and not by way of
limitation.

EXAMPLES

EXAMPLE 1

Purification and Characterization of Polypeptides
from M. tuberculosis Culture Filtrate



This example illustrates the preparation of M. tuberculosis soluble 
polypeptides from culture filtrate. Unless otherwise noted, all percentages in the following
example are weight per volume.

M. tuberculosis (either H37Ra, ATCC No. 25177, or H37Rv, ATCC
No. 25618) was cultured in sterile GAS media at 37°C for fourteen days. The media was
then vacuum filtered (leaving the bulk of the cells) through a 0.45 µ filter into a sterile 2.5 L
bottle. The media was then filtered through a 0.2 µ filter into a sterile 4 L bottle. NaN3 was
then added to the culture filtrate to a concentration of 0.04%. The bottles were then placed in
a 4°C cold room. 

The culture filtrate was concentrated by placing the filtrate in a 12 L reservoir 
that had been autoclaved and feeding the filtrate into a 400 ml Amicon stir cell which had 
been rinsed with ethanol and contained a 10,000 kDa MWCO membrane. The pressure was 
maintained at 60 psi using nitrogen gas. This procedure reduced the 12 L volume to
approximately 50 ml. 

The culture filtrate was then dialyzed into 0.1% ammonium bicarbonate using 
an 8,000 kDa MWCO cellulose ester membrane, with two changes of ammonium bicarbonate
solution. Protein concentration was then determined by a commercially available BCA assay
(Pierce, Rockford, IL).

The dialyzed culture filtrate was then lyophilized, and the polypeptides 
resuspended in distilled water. The polypeptides were then dialyzed against 0.01 mM 1,3 
bis[tris(hydroxymethyl)-methylamino]propane, pH 7.5 (Bis-Tris propane buffer), the initial 
conditions for anion exchange chromatography. Fractionation was performed using gel 
perfusion chromatography on a POROS 146 II Q/M anion exchange column 4.6 mm x
100 mm (Perseptive BioSystems, Framingham, MA) equilibrated in 0.01 mM Bis-Tris 
propane buffer pH 7.5. Polypeptides were eluted with a linear 0-0.5 M NaCl gradient in the 
above buffer system. The column eluent was monitored at a wavelength of 220 nm. 

The pools of polypeptides eluting from the ion exchange column were 
dialyzed against distilled water and lyophilized. The resulting material was dissolved in 0.1%
trifluoroacetic acid (TFA) pH 1.9 in water, and the polypeptides were purified on a Delta-Pak
C18 column (Waters, Milford, MA) 300 Angstrom pore size, 5 micron particle size (3.9 x
150 mm). The polypeptides were eluted from the column with a linear gradient from 0-60%
dilution buffer (0.1% TFA in acetonitrile). The flow rate was 0.75 ml/minute and the HPLC
eluent was monitored at 214 nm. Fractions containing the eluted polypeptides were collected
to maximize the purity of the individual samples. Approximately 200 purified polypeptides
were obtained. 

The purified polypeptides were then screened for the ability to induce T-cell
proliferation in PBMC preparations. The PBMCs from donors known to be PPD skin test
positive and whose T cells were shown to proliferate in response to PPD and crude soluble
proteins from MTB were cultured in medium comprising RPMI 1640 supplemented with
10% pooled human serum and 50 µg/ml gentamicin. Purified polypeptides were added in
duplicate at concentrations of 0.5 to 10 µg/mL. After six days of culture in 96-well
round-bottom plates in a volume of 200 µl, 50 µl of medium was removed from each well for
determination of IFN-γ levels, as described below. The plates were then pulsed with
1 µCi/well of tritiated thymidine for a further 18 hours, harvested and tritium uptake
determined using a gas scintillation counter. Fractions that resulted in proliferation in both
replicates three fold greater than the proliferation observed in cells cultured in medium alone
were considered positive.

IFN-γ was measured using an enzyme-linked immunosorbent assay (ELISA).
ELISA plates were coated with a mouse monoclonal antibody directed to human IFN-γ
(Chemicon) in PBS for four hours at room temperature. Wells were then blocked with PBS
containing 5% (W/V) non-fat dried milk for 1 hour at room temperature. The plates were
then washed six times in PBS/0.2% TWEEN-20 and samples diluted 1:2 in culture medium
in the ELISA plates were incubated overnight at room temperature. The plates were again
washed and a polyclonal rabbit anti-human IFN-γ serum diluted 1:3000 in PBS/10% normal
goat serum was added to each well. The plates were then incubated for two hours at room
temperature, washed and horseradish peroxidase-coupled anti-rabbit IgG (Jackson Labs.) was
added at a 1:2000 dilution in PBS/5% non-fat dried milk. After a further two hour incubation
at room temperature, the plates were washed and TMB substrate added. The reaction was
stopped after 20 min with 1 N sulfuric acid. Optical density was determined at 450 nm using
570 nm as a reference wavelength. Fractions that resulted in both replicates giving an OD
two fold greater than the mean OD from cells cultured in medium alone, plus 3 standard
deviations, were considered positive. 
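
The two positivity criteria used in this screen may be expressed, as a sketch only, in the following Python. The readings are hypothetical, and the IFN-γ threshold is interpreted here as twice the mean medium-alone OD plus three sample standard deviations of the medium-alone wells; that reading of the criterion is an assumption.

```python
import statistics

def proliferation_positive(replicate_cpm, medium_cpm, fold=3.0):
    """Both replicates must exceed three fold the counts seen in medium alone."""
    return all(cpm > fold * medium_cpm for cpm in replicate_cpm)

def ifn_positive(replicate_od, medium_od, fold=2.0, n_sd=3.0):
    """Both replicates must exceed fold x mean(medium OD) + n_sd x SD(medium OD)."""
    threshold = fold * statistics.mean(medium_od) + n_sd * statistics.stdev(medium_od)
    return all(od > threshold for od in replicate_od)

# Hypothetical duplicate readings for one fraction.
prolif_call = proliferation_positive([3500, 4100], medium_cpm=1000)
ifn_call = ifn_positive([0.30, 0.26], medium_od=[0.10, 0.12, 0.08, 0.10])
```

Requiring both replicates to pass, rather than their mean, guards against a single outlying well producing a positive call.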

For sequencing, the polypeptides were individually dried onto
Biobrene™ (Perkin Elmer/Applied BioSystems Division, Foster City, CA) treated glass fiber
filters. The filters with polypeptide were loaded onto a Perkin Elmer/Applied BioSystems
Division Procise 492 protein sequencer. The polypeptides were sequenced from the amino
terminus using traditional Edman chemistry. The amino acid sequence was determined
for each polypeptide by comparing the retention time of the PTH amino acid derivative to the
appropriate PTH derivative standards.

Using the procedure described above, antigens having the following 
N-terminal sequences were isolated: 

(a) Asp-Pro-Val-Asp-Ala-Val-Ile-Asn-Thr-Thr-Xaa-Asn-Tyr-Gly-Gln-
Val-Val-Ala-Ala-Leu (SEQ ID NO: 54); 

(b) Ala-Val-Glu-Ser-Gly-Met-Leu-Ala-Leu-Gly-Thr-Pro-Ala-Pro-Ser 
(SEQ ID NO: 55); 

(c) Ala-Ala-Met-Lys-Pro-Arg-Thr-Gly-Asp-Gly-Pro-Leu-Glu-Ala-Ala- 
Lys-Glu-Gly-Arg (SEQ ID NO: 56);

(d) Tyr-Tyr-Trp-Cys-Pro-Gly-Gln-Pro-Phe-Asp-Pro-Ala-Trp-Gly-Pro 
(SEQ ID NO: 57); 

(e) Asp-Ile-Gly-Ser-Glu-Ser-Thr-Glu-Asp-Gln-Gln-Xaa-Ala-Val (SEQ ID
NO: 58); 

(f) Ala-Glu-Glu-Ser-Ile-Ser-Thr-Xaa-Glu-Xaa-Ile-Val-Pro (SEQ ID 
NO: 59); 

(g) Asp-Pro-Glu-Pro-Ala-Pro-Pro-Val-Pro-Thr-Ala-Ala-Ala-Ala-Pro-Pro- 
Ala (SEQ ID NO: 60); and 

(h) Ala-Pro-Lys-Thr-Tyr-Xaa-Glu-Glu-Leu-Lys-Gly-Thr-Asp-Thr-Gly 
(SEQ ID NO: 61); 

wherein Xaa may be any amino acid. 
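
For database searches, N-terminal sequences written in three-letter code are usually converted to one-letter code. The following is a minimal sketch of that conversion, assuming the standard three-letter abbreviations and mapping Xaa to X. The helper name `to_one_letter` is hypothetical, and the two example strings are copied from sequences (g) and (h) above.

```python
# Standard three-letter to one-letter amino acid codes; Xaa (unknown) maps to X.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Xaa": "X",
}

def to_one_letter(three_letter_seq):
    """Convert a hyphen-separated three-letter sequence to one-letter code."""
    residues = [r.strip() for r in three_letter_seq.split("-") if r.strip()]
    return "".join(THREE_TO_ONE[r] for r in residues)

seq_g = "Asp-Pro-Glu-Pro-Ala-Pro-Pro-Val-Pro-Thr-Ala-Ala-Ala-Ala-Pro-Pro-Ala"
seq_h = "Ala-Pro-Lys-Thr-Tyr-Xaa-Glu-Glu-Leu-Lys-Gly-Thr-Asp-Thr-Gly"
print(to_one_letter(seq_g))  # DPEPAPPVPTAAAAPPA
print(to_one_letter(seq_h))  # APKTYXEELKGTDTG
```

An unrecognized abbreviation raises a `KeyError`, which is a deliberate choice so that transcription errors in a sequence are not silently passed to a search.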

An additional antigen was isolated employing a microbore HPLC purification
step in addition to the procedure described above. Specifically, 20 µl of a fraction comprising
a mixture of antigens from the chromatographic purification step previously described was
purified on an Aquapore C18 column (Perkin Elmer/Applied Biosystems Division, Foster
City, CA) with a 7 micron pore size, column size 1 mm x 100 mm, in a Perkin Elmer/Applied
Biosystems Division Model 172 HPLC. Fractions were eluted from the column with a linear
gradient of 1%/minute of acetonitrile (containing 0.05% TFA) in water (0.05% TFA) at a
flow rate of 80 µl/minute. The eluent was monitored at 250 nm. The original fraction was
separated into 4 major peaks plus other smaller components and a polypeptide was obtained
which was shown to have a molecular weight of 12.054 Kd (by mass spectrometry) and the
following N-terminal sequence:

(i) Asp-Pro-Ala-Ser-Ala-Pro-Asp-Val-Pro-Thr-Ala-Ala-Gln-Gln-Thr-Ser-
Leu-Leu-Asn-Asn-Leu-Ala-Asp-Pro-Asp-Val-Ser-Phe-Ala-Asp (SEQ
ID NO: 62).

This polypeptide was shown to induce proliferation and IFN-γ production in PBMC
preparations using the assays described above.

Additional soluble antigens were isolated from M. tuberculosis culture filtrate
as follows. M. tuberculosis culture filtrate was prepared as described above. Following
dialysis against Bis-Tris propane buffer, at pH 5.5, fractionation was performed using anion
exchange chromatography on a Poros QE column 4.6 x 100 mm (Perseptive Biosystems)
equilibrated in Bis-Tris propane buffer pH 5.5. Polypeptides were eluted with a linear 0-1.5
M NaCl gradient in the above buffer system at a flow rate of 10 ml/min. The column eluent
was monitored at a wavelength of 214 nm.

The fractions eluting from the ion exchange column were pooled and
subjected to reverse phase chromatography using a Poros R2 column 4.6 x 100 mm
(Perseptive Biosystems). Polypeptides were eluted from the column with a linear gradient
from 0-100% acetonitrile (0.1% TFA) at a flow rate of 5 ml/min. The eluent was monitored
at 214 nm.

Fractions containing the eluted polypeptides were lyophilized and resuspended
in 80 µl of aqueous 0.1% TFA and further subjected to reverse phase chromatography on a
Vydac C4 column 4.6 x 150 mm (Western Analytical, Temecula, CA) with a linear gradient
of 0-100% acetonitrile (0.1% TFA) at a flow rate of 2 ml/min. Eluent was monitored at 214
nm.

The fraction with biological activity was separated into one major peak plus
other smaller components. Western blot of this peak onto PVDF membrane revealed three
major bands of molecular weights 14 Kd, 20 Kd and 26 Kd. These polypeptides were
determined to have the following N-terminal sequences, respectively:

(j) Xaa-Asp-Ser-Glu-Lys-Ser-Ala-Thr-Ile-Lys-Val-Thr-Asp-Ala-Ser
(SEQ ID NO: 129);

(k) Ala-Gly-Asp-Thr-Xaa-Ile-Tyr-Ile-Val-Gly-Asn-Leu-Thr-Ala-Asp
(SEQ ID NO: 130); and

(l) Ala-Pro-Glu-Ser-Gly-Ala-Gly-Leu-Gly-Gly-Thr-Val-Gln-Ala-Gly
(SEQ ID NO: 131), wherein Xaa may be any amino acid.

Using the assays described above, these polypeptides were shown to induce proliferation and
IFN-γ production in PBMC preparations. Figs. 1A and B show the results of such assays
using PBMC preparations from a first and a second donor, respectively.

DNA sequences that encode the antigens designated as (a), (c), (d) and (g)
above were obtained by screening a M. tuberculosis genomic library using 32P end-labeled
degenerate oligonucleotides corresponding to the N-terminal sequence and containing
M. tuberculosis codon bias. The screen performed using a probe corresponding to antigen (a)
above identified a clone having the sequence provided in SEQ ID NO: 96. The polypeptide
encoded by SEQ ID NO: 96 is provided in SEQ ID NO: 97. The screen performed using a
probe corresponding to antigen (g) above identified a clone having the sequence provided in
SEQ ID NO: 52. The polypeptide encoded by SEQ ID NO: 52 is provided in SEQ ID
NO: 53. The screen performed using a probe corresponding to antigen (d) above identified a
clone having the sequence provided in SEQ ID NO: 24, and the screen performed with a
probe corresponding to antigen (c) identified a clone having the sequence provided in SEQ ID
NO: 25.
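
To see why codon bias matters when designing degenerate oligonucleotide probes from an N-terminal sequence, the number of distinct DNA sequences that could encode a peptide can be counted. The sketch below assumes the standard genetic code and no codon bias, so it gives an upper bound. The function name `probe_degeneracy` is hypothetical, and the example peptides are the first six residues of antigens (d) and (g) in one-letter code.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code.
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}

def probe_degeneracy(peptide):
    """Count the DNA sequences that encode `peptide` (one-letter code)."""
    return prod(CODONS_PER_RESIDUE[aa] for aa in peptide)

print(probe_degeneracy("YYWCPG"))  # antigen (d), residues 1-6: 128
print(probe_degeneracy("DPEPAP"))  # antigen (g), residues 1-6: 1024
```

Restricting each position to the codons favored in a GC-rich genome such as M. tuberculosis reduces these counts, which is presumably the purpose of building codon bias into the probes.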

The above amino acid sequences were compared to known amino acid
sequences in the gene bank using the DNA STAR system. The database searched contains
some 173,000 proteins and is a combination of the Swiss and PIR databases along with translated
protein sequences (Version 87). No significant homologies to the amino acid sequences for
antigens (a)-(h) and (l) were detected.

The amino acid sequence for antigen (i) was found to be homologous to a
sequence from M. leprae. The full length M. leprae sequence was amplified from genomic
DNA using the sequence obtained from GENBANK. This sequence was then used to screen
an M. tuberculosis library and a full length copy of the M. tuberculosis homologue was
obtained (SEQ ID NO: 94).

The amino acid sequence for antigen (j) was found to be homologous to a
known M. tuberculosis protein translated from a DNA sequence. To the best of the
inventors' knowledge, this protein has not been previously shown to possess T-cell
stimulatory activity. The amino acid sequence for antigen (k) was found to be related to a
sequence from M. leprae.

In the proliferation and IFN-γ assays described above, using three PPD
positive donors, the results for representative antigens provided above are presented in Table
1:

                              TABLE 1

            Results of PBMC Proliferation and IFN-γ Assays

    Sequence          Proliferation          IFN-γ

    (a)               +
    (c)               +++                    +++
    (d)               ++                     ++
    (g)               +++                    +++
    (h)               +++                    +++

In Table 1, responses that gave a stimulation index (SI) of between 2 and 4
(compared to cells cultured in medium alone) were scored as +, an SI of 4-8 or 2-4 at a
concentration of 1 µg or less was scored as ++ and an SI of greater than 8 was scored as +++.
The antigen of sequence (i) was found to have a high SI (+++) for one donor and lower SI
(++ and +) for the two other donors in both proliferation and IFN-γ assays. These results
indicate that these antigens are capable of inducing proliferation and/or interferon-γ
production.
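
The scoring scheme used for Table 1 can be expressed as a small function. This is a sketch only: the text does not state how values exactly at 4 or 8 are treated or how responses below 2 are recorded, so the boundary handling and the "-" label are assumptions, as is the function name `score_si`.

```python
def score_si(si, conc_ug=None):
    """Map a stimulation index (SI) to the +/++/+++ scale of Table 1.

    conc_ug is the antigen concentration in micrograms, if known. An SI of
    2-4 is upgraded to ++ when it was obtained at 1 microgram or less.
    Boundary handling (>= vs >) and the "-" label are assumptions.
    """
    if si > 8:
        return "+++"
    if si >= 4:
        return "++"
    if si >= 2:
        if conc_ug is not None and conc_ug <= 1:
            return "++"
        return "+"
    return "-"

print(score_si(9.5))               # +++
print(score_si(5.0))               # ++
print(score_si(3.0))               # +
print(score_si(3.0, conc_ug=0.5))  # ++
```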

EXAMPLE 2 

Use of Patient Sera to Isolate M. tuberculosis Antigens

This example illustrates the isolation of antigens from M. tuberculosis lysate 
by screening with serum from M. tuberculosis-infected individuals. 

Desiccated M. tuberculosis H37Ra (Difco Laboratories) was added to a 2%
NP40 solution, and alternately homogenized and sonicated three times. The resulting
suspension was centrifuged at 13,000 rpm in microfuge tubes and the supernatant put through
a 0.2 micron syringe filter. The filtrate was bound to Macro Prep DEAE beads (BioRad,
Hercules, CA). The beads were extensively washed with 20 mM Tris pH 7.5 and bound
proteins eluted with 1 M NaCl. The NaCl eluate was dialyzed overnight against 10 mM Tris,
pH 7.5. Dialyzed solution was treated with DNase and RNase at 0.05 mg/ml for 30 min. at
room temperature and then with α-D-mannosidase, 0.5 U/mg at pH 4.5 for 3-4 hours at room
temperature. After returning to pH 7.5, the material was fractionated via FPLC over a Bio
Scale-Q-20 column (BioRad). Fractions were combined into nine pools, concentrated in a
Centriprep 10 (Amicon, Beverly, MA) and screened by Western blot for serological activity
using a serum pool from M. tuberculosis-infected patients which was not immunoreactive
with other antigens of the present invention.

The most reactive fraction was run in SDS-PAGE and transferred to PVDF. A 
band at approximately 85 Kd was cut out yielding the sequence: 

(m) Xaa-Tyr-Ile-Ala-Tyr-Xaa-Thr-Thr-Ala-Gly-Ile-Val-Pro-Gly-Lys-Ile-
Asn-Val-His-Leu-Val (SEQ ID NO: 132), wherein Xaa may be any
amino acid.

Comparison of this sequence with those in the gene bank as described above, 
revealed no significant homologies to known sequences. 

A DNA sequence that encodes the antigen designated as (m) above was
obtained by screening a genomic M. tuberculosis Erdman strain library using labeled
degenerate oligonucleotides corresponding to the N-terminal sequence of SEQ ID NO: 137. A
clone was identified having the DNA sequence provided in SEQ ID NO: 198. This sequence
was found to encode the amino acid sequence provided in SEQ ID NO: 199. Comparison of
these sequences with those in the genebank revealed some similarity to sequences previously
identified in M. tuberculosis and M. bovis.

EXAMPLE 3 

Preparation of DNA Sequences Encoding M. tuberculosis Antigens

This example illustrates the preparation of DNA sequences encoding
M. tuberculosis antigens by screening a M. tuberculosis expression library with sera obtained
from patients infected with M. tuberculosis, or with anti-sera raised against M. tuberculosis
antigens.

A. Preparation of M. tuberculosis Soluble Antigens using Rabbit Anti-sera
Raised against M. tuberculosis Supernatant

Genomic DNA was isolated from the M. tuberculosis strain H37Ra. The DNA
was randomly sheared and used to construct an expression library using the Lambda ZAP
expression system (Stratagene, La Jolla, CA). Rabbit anti-sera were generated against
secretory proteins of the M. tuberculosis strains H37Ra, H37Rv and Erdman by immunizing a
rabbit with concentrated supernatant of the M. tuberculosis cultures. Specifically, the rabbit
was first immunized subcutaneously with 200 µg of protein antigen in a total volume of 2 ml
containing 100 µg muramyl dipeptide (Calbiochem, La Jolla, CA) and 1 ml of incomplete
Freund's adjuvant. Four weeks later the rabbit was boosted subcutaneously with 100 µg
antigen in incomplete Freund's adjuvant. Finally, the rabbit was immunized intravenously
four weeks later with 50 µg protein antigen. The anti-sera were used to screen the expression
library as described in Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold
Spring Harbor Laboratories, Cold Spring Harbor, NY, 1989. Bacteriophage plaques
expressing immunoreactive antigens were purified. Phagemid from the plaques was rescued
and the nucleotide sequences of the M. tuberculosis clones deduced.

Thirty two clones were purified. Of these, 25 represent sequences that have
not been previously identified in M. tuberculosis. Proteins were induced by IPTG and
purified by gel elution, as described in Skeiky et al., J. Exp. Med. 181:1527-1537, 1995.
Representative partial sequences of DNA molecules identified in this screen are provided in
SEQ ID NOS: 1-25. The corresponding predicted amino acid sequences are shown in SEQ
ID NOS: 64-88.

On comparison of these sequences with known sequences in the gene bank
using the databases described above, it was found that the clones referred to hereinafter as
TbRA2A, TbRA16, TbRA18, and TbRA29 (SEQ ID NOS: 77, 69, 71, 76) show some
homology to sequences previously identified in Mycobacterium leprae but not in
M. tuberculosis. TbRA11, TbRA26, TbRA28 and TbDPEP (SEQ ID NOS: 66, 74, 75, 53)
have been previously identified in M. tuberculosis. No significant homologies were found to
TbRA1, TbRA3, TbRA4, TbRA9, TbRA10, TbRA13, TbRA17, TbRA19, TbRA29,
TbRA32, TbRA36 and the overlapping clones TbRA35 and TbRA12 (SEQ ID NOS: 64, 78,
82, 83, 65, 68, 76, 72, 76, 79, 81, 80, 67, respectively). The clone TbRa24 overlaps
with clone TbRa29.

B. Use of Sera from Patients having Pulmonary or Pleural Tuberculosis to 
Identify DNA Sequences Encoding M. tuberculosis Antigens 

The genomic DNA library described above, and an additional H37Rv library,
were screened using pools of sera obtained from patients with active tuberculosis. To prepare
the H37Rv library, M. tuberculosis strain H37Rv genomic DNA was isolated, subjected to
partial Sau3A digestion and used to construct an expression library using the Lambda Zap
expression system (Stratagene, La Jolla, CA). Three different pools of sera, each containing
sera obtained from three individuals with active pulmonary or pleural disease, were used in
the expression screening. The pools were designated TbL, TbM and TbH, referring to
relative reactivity with H37Ra lysate (i.e., TbL = low reactivity, TbM = medium reactivity
and TbH = high reactivity) in both ELISA and immunoblot format. A fourth pool of sera
from seven patients with active pulmonary tuberculosis was also employed. All of the sera
lacked increased reactivity with the recombinant 38 kD M. tuberculosis H37Ra phosphate-
binding protein.

All pools were pre-adsorbed with E. coli lysate and used to screen the H37Ra 
and H37Rv expression libraries, as described in Sambrook et al., Molecular Cloning: A 
Laboratory Manual, Cold Spring Harbor Laboratories, Cold Spring Harbor, NY, 1989. 
Bacteriophage plaques expressing immunoreactive antigens were purified. Phagemid from 
the plaques was rescued and the nucleotide sequences of the M. tuberculosis clones deduced. 

Thirty two clones were purified. Of these, 31 represented sequences that had
not been previously identified in human M. tuberculosis. Representative sequences of the
DNA molecules identified are provided in SEQ ID NOS: 26-51 and 100. Of these, TbH-8-2
(SEQ. ID NO. 100) is a partial clone of TbH-8, and TbH-4 (SEQ. ID NO. 43) and TbH-4-
FWD (SEQ. ID NO. 44) are non-contiguous sequences from the same clone. Amino acid
sequences for the antigens hereinafter identified as Tb38-1, TbH-4, TbH-8, TbH-9, and
TbH-12 are shown in SEQ ID NOS.: 89-93. Comparison of these sequences with known
sequences in the gene bank using the databases identified above revealed no significant
homologies to TbH-4, TbH-8, TbH-9 and TbM-3, although weak homologies were found to
TbH-9. TbH-12 was found to be homologous to a 34 kD antigenic protein previously
identified in M. paratuberculosis (Acc. No. S28515). Tb38-1 was found to be located 34
base pairs upstream of the open reading frame for the antigen ESAT-6 previously identified
in M. bovis (Acc. No. U34848) and in M. tuberculosis (Sorensen et al., Infect. Immun.
63:1710-1717, 1995).

Probes derived from Tb38-1 and TbH-9, both isolated from an H37Ra library,
were used to identify clones in an H37Rv library. Tb38-1 hybridized to Tb38-1F2, Tb38-1F3,
Tb38-1F5 and Tb38-1F6 (SEQ. ID NOS: 107, 108, 111, 113, and 114). (SEQ ID NOS:
107 and 108 are non-contiguous sequences from clone Tb38-1F2.) Two open reading frames
were deduced in Tb38-1F2; one corresponds to Tb37FL (SEQ. ID. NO. 109), the second, a
partial sequence, may be the homologue of Tb38-1 and is called Tb38-IN (SEQ. ID NO. 110).
The deduced amino acid sequence of Tb38-1F3 is presented in SEQ. ID. NO. 112. A TbH-9
probe identified three clones in the H37Rv library: TbH-9-FL (SEQ. ID NO. 101), which
may be the homologue of TbH-9 (H37Ra), TbH-9-1 (SEQ. ID NO. 103), and TbH-8-2 (SEQ.
ID NO. 105), which is a partial clone of TbH-8. The deduced amino acid sequences for these three
clones are presented in SEQ ID NOS: 102, 104 and 106.

Further screening of the M. tuberculosis genomic DNA library, as described
above, resulted in the recovery of ten additional reactive clones, representing seven different
genes. One of these genes was identified as the 38 Kd antigen discussed above, one was
determined to be identical to the 14 Kd alpha crystallin heat shock protein previously shown
to be present in M. tuberculosis, and a third was determined to be identical to the antigen
TbH-8 described above. The determined DNA sequences for the remaining five clones
(hereinafter referred to as TbH-29, TbH-30, TbH-32 and TbH-33) are provided in SEQ ID
NO: 133-136, respectively, with the corresponding predicted amino acid sequences being
provided in SEQ ID NO: 137-140, respectively. The DNA and amino acid sequences for
these antigens were compared with those in the gene bank as described above. No
homologies were found to the 5' end of TbH-29 (which contains the reactive open reading
frame), although the 3' end of TbH-29 was found to be identical to the M. tuberculosis
cosmid Y227. TbH-32 and TbH-33 were found to be identical to the previously identified
M. tuberculosis insertion element IS6110 and to the M. tuberculosis cosmid Y50,
respectively. No significant homologies to TbH-30 were found.

Positive phagemid from this additional screening were used to infect E. coli
XL-1 Blue MRF', as described in Sambrook et al., supra. Induction of recombinant protein
was accomplished by the addition of IPTG. Induced and uninduced lysates were run in
duplicate on SDS-PAGE and transferred to nitrocellulose filters. Filters were reacted with
human M. tuberculosis sera (1:200 dilution) reactive with TbH and a rabbit sera (1:200 or
1:250 dilution) reactive with the N-terminal 4 Kd portion of lacZ. Sera incubations were
performed for 2 hours at room temperature. Bound antibody was detected by addition of 125I-
labeled Protein A and subsequent exposure to film for variable times ranging from 16 hours
to 11 days. The results of the immunoblots are summarized in Table 2.

                              TABLE 2

    Antigen          Human M. tb Sera          Anti-lacZ Sera

    TbH-29           45 Kd                     45 Kd
    TbH-30           No reactivity             29 Kd
    TbH-32           12 Kd                     12 Kd
    TbH-33           16 Kd                     16 Kd



Positive reaction of the recombinant human M. tuberculosis antigens with both
the human M. tuberculosis sera and anti-lacZ sera indicates that reactivity of the human M.
tuberculosis sera is directed towards the fusion protein. Antigens reactive with the anti-lacZ
sera but not with the human M. tuberculosis sera may be the result of the human M.
tuberculosis sera recognizing conformational epitopes, or the antigen-antibody binding
kinetics may be such that the 2 hour sera exposure in the immunoblot is not sufficient.

Studies were undertaken to determine whether the antigens TbH-9 and Tb38-1
represent cellular proteins or are secreted into M. tuberculosis culture media. In the first
study, rabbit sera were raised against A) secretory proteins of M. tuberculosis, B) the known
secretory recombinant M. tuberculosis antigen 85b, C) recombinant Tb38-1 and D)
recombinant TbH-9, using protocols substantially as described in Example 3A. Total M.
tuberculosis lysate, concentrated supernatant of M. tuberculosis cultures and the recombinant
antigens 85b, TbH-9 and Tb38-1 were resolved on denaturing gels, immobilized on
nitrocellulose membranes and duplicate blots were probed using the rabbit sera described
above.

The results of this analysis using control sera (panel I) and antisera (panel II)
against secretory proteins, recombinant 85b, recombinant Tb38-1 and recombinant TbH-9 are
shown in Figures 2A-D, respectively, wherein the lane designations are as follows: 1)
molecular weight protein standards; 2) 5 µg of M. tuberculosis lysate; 3) 5 ng secretory
proteins; 4) 50 ng recombinant Tb38-1; 5) 50 ng recombinant TbH-9; and 6) 50 ng
recombinant 85b. The recombinant antigens were engineered with six terminal histidine
residues and would therefore be expected to migrate with a mobility approximately 1 kD
larger than the native protein. In Figure 2D, recombinant TbH-9 is lacking approximately 10
kD of the full-length 42 kD antigen, hence the significant difference in the size of the
immunoreactive native TbH-9 antigen in the lysate lane (indicated by an arrow). These
results demonstrate that Tb38-1 and TbH-9 are intracellular antigens and are not actively
secreted by M. tuberculosis.
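
The statement that six terminal histidine residues add roughly 1 kD can be checked with simple arithmetic. The sketch below assumes an average residue mass of 137.14 Da for histidine (free histidine minus one water) and ignores any linker residues, which the text does not describe. The function names are hypothetical.

```python
# Average mass of one histidine residue in a peptide chain, in daltons
# (free histidine, about 155.16 Da, minus one water, about 18.02 Da).
HIS_RESIDUE_MASS_DA = 137.14

def his_tag_mass_da(n_his=6):
    """Mass added by a tag of n_his histidine residues, in daltons."""
    return n_his * HIS_RESIDUE_MASS_DA

def expected_tagged_mass_kd(native_kd, n_his=6):
    """Expected mass in kD of a protein carrying the histidine tag."""
    return native_kd + his_tag_mass_da(n_his) / 1000.0

print(round(his_tag_mass_da(), 2))  # 822.84 Da, i.e. a little under 1 kD
```

The calculated shift of about 0.8 kD is consistent with the "approximately 1 kD" quoted above; apparent mobility on a denaturing gel can differ somewhat from the calculated mass.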

The finding that TbH-9 is an intracellular antigen was confirmed by
determining the reactivity of TbH-9-specific human T cell clones to recombinant TbH-9,
secretory M. tuberculosis proteins and PPD. A TbH-9-specific T cell clone (designated
131TbH-9) was generated from PBMC of a healthy PPD-positive donor. The proliferative
response of 131TbH-9 to secretory proteins, recombinant TbH-9 and a control M.
tuberculosis antigen, TbRa11, was determined by measuring uptake of tritiated thymidine, as
described in Example 1. As shown in Figure 3A, the clone 131TbH-9 responds specifically
to TbH-9, showing that TbH-9 is not a significant component of M. tuberculosis secretory
proteins. Figure 3B shows the production of IFN-γ by a second TbH-9-specific T cell clone
(designated PPD 800-10) prepared from PBMC from a healthy PPD-positive donor,
following stimulation of the T cell clone with secretory proteins, PPD or recombinant TbH-9.
These results further confirm that TbH-9 is not secreted by M. tuberculosis.

C. Use of Sera From Patients having Extrapulmonary Tuberculosis to Identify
DNA Sequences Encoding M. tuberculosis Antigens

Genomic DNA was isolated from M. tuberculosis Erdman strain, randomly
sheared and used to construct an expression library employing the Lambda ZAP expression
system (Stratagene, La Jolla, CA). The resulting library was screened using pools of sera
obtained from individuals with extrapulmonary tuberculosis, as described above in Example
3B, with the secondary antibody being goat anti-human IgG + A + M (H+L) conjugated with
alkaline phosphatase.

Eighteen clones were purified. Of these, 4 clones (hereinafter referred to as
XP14, XP24, XP31 and XP32) were found to bear some similarity to known sequences. The
determined DNA sequences for XP14, XP24 and XP31 are provided in SEQ ID NOS: 151-
153, respectively, with the 5' and 3' DNA sequences for XP32 being provided in SEQ ID
NOS: 154 and 155, respectively. The predicted amino acid sequence for XP14 is provided in
SEQ ID NO: 156. The reverse complement of XP14 was found to encode the amino acid
sequence provided in SEQ ID NO: 157.
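
Because a cloned insert can carry an open reading frame on either strand, both the determined sequence and its reverse complement are translated. The following is a minimal sketch of that operation using the standard genetic code; the function names and the nine-base example sequence are hypothetical and are not taken from any SEQ ID NO.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]

def translate(dna, frame=0):
    """Translate one reading frame; '*' marks a stop codon."""
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)

forward = "ATGGCCTAA"  # hypothetical example sequence
print(translate(forward))                      # MA*
print(reverse_complement(forward))             # TTAGGCCAT
print(translate(reverse_complement(forward)))  # LGH
```

A full search would translate all three frames of each strand and report the longest stretch between stop codons; only frame 0 is shown here for brevity.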

Comparison of the sequences for the remaining 14 clones (hereinafter referred
to as XP1-XP6, XP17-XP19, XP22, XP25, XP27, XP30 and XP36) with those in the
genebank as described above, revealed no homologies with the exception of the 3' ends of
XP2 and XP6 which were found to bear some homology to known M. tuberculosis cosmids.
The DNA sequences for XP27 and XP36 are shown in SEQ ID NOS: 158 and 159,
respectively, with the 5' sequences for XP4, XP5, XP17 and XP30 being shown in SEQ ID
NOS: 160-163, respectively, and the 5' and 3' sequences for XP2, XP3, XP6, XP18, XP19,
XP22 and XP25 being shown in SEQ ID NOS: 164 and 165; 166 and 167; 168 and 169; 170
and 171; 172 and 173; 174 and 175; and 176 and 177, respectively. XP1 was found to
overlap with the DNA sequences for TbH4, disclosed above. The full-length DNA sequence
for TbH4-XP1 is provided in SEQ ID NO: 178. This DNA sequence was found to contain an
open reading frame encoding the amino acid sequence shown in SEQ ID NO: 179. The
reverse complement of TbH4-XP1 was found to contain an open reading frame encoding the
amino acid sequence shown in SEQ ID NO: 180. The DNA sequence for XP36 was found to
contain two open reading frames encoding the amino acid sequences shown in SEQ ID NOS:
181 and 182, with the reverse complement containing an open reading frame encoding the
amino acid sequence shown in SEQ ID NO: 183.

Recombinant XP1 protein was prepared as described above in Example 3B,
with a metal ion affinity chromatography column being employed for purification.
Recombinant XP1 was found to stimulate cell proliferation and IFN-γ production in T cells
isolated from M. tuberculosis-immune donors.

D. Preparation of M. tuberculosis Soluble Antigens using Rabbit Anti-sera
Raised against M. tuberculosis Fractionated Proteins

M. tuberculosis lysate was prepared as described above in Example 2. The
resulting material was fractionated by HPLC and the fractions screened by Western blot for
serological activity with a serum pool from M. tuberculosis-infected patients which showed
little or no immunoreactivity with other antigens of the present invention. Rabbit anti-sera
were generated against the most reactive fraction using the method described in Example 3A.
The anti-sera were used to screen an M. tuberculosis Erdman strain genomic DNA expression
library prepared as described above. Bacteriophage plaques expressing immunoreactive
antigens were purified. Phagemid from the plaques was rescued and the nucleotide sequences
of the M. tuberculosis clones determined.

Ten different clones were purified. Of these, one was found to be TbRa35,
described above, and one was found to be the previously identified M. tuberculosis antigen,
HSP60. Of the remaining eight clones, six (hereinafter referred to as RDIF2, RDIF5, RDIF8,
RDIF10, RDIF11 and RDIF12) were found to bear some similarity to previously identified
M. tuberculosis sequences. The determined DNA sequences for RDIF2, RDIF5, RDIF8,
RDIF10 and RDIF11 are provided in SEQ ID NOS: 184-188, respectively, with the
corresponding predicted amino acid sequences being provided in SEQ ID NOS: 189-193,
respectively. The 5' and 3' DNA sequences for RDIF12 are provided in SEQ ID NOS: 194
and 195, respectively. No significant homologies were found to the antigen RDIF7. The
determined DNA and predicted amino acid sequences for RDIF7 are provided in SEQ ID
NOS: 196 and 197, respectively. One additional clone, referred to as RDIF6, was isolated;
however, this was found to be identical to RDIF5.

Recombinant RDIF6, RDIF8, RDIF10 and RDIF11 were prepared as
described above. These antigens were found to stimulate cell proliferation and IFN-γ
production in T cells isolated from M. tuberculosis-immune donors.



EXAMPLE 4

Purification and Characterization of a Polypeptide from Tuberculin Purified
Protein Derivative

An M. tuberculosis polypeptide was isolated from tuberculin purified protein
derivative (PPD) as follows.

PPD was prepared as published with some modification (Seibert, F. et al.,
Tuberculin purified protein derivative. Preparation and analyses of a large quantity for
standard. The American Review of Tuberculosis 44:9-25, 1941). M. tuberculosis Rv strain
was grown for 6 weeks in synthetic medium in roller bottles at 37°C. Bottles containing the
bacterial growth were then heated to 100°C in water vapor for 3 hours. Cultures were sterile
filtered using a 0.22 µ filter and the liquid phase was concentrated 20 times using a 3 kD cut-
off membrane. Proteins were precipitated once with 50% ammonium sulfate solution and
eight times with 25% ammonium sulfate solution. The resulting proteins (PPD) were
fractionated by reverse phase liquid chromatography (RP-HPLC) using a C18 column (7.8 x
300 mm; Waters, Milford, MA) in a Biocad HPLC system (Perseptive Biosystems,
Framingham, MA). Fractions were eluted from the column with a linear gradient from 0-
100% buffer (0.1% TFA in acetonitrile). The flow rate was 10 ml/minute and eluent was
monitored at 214 nm and 280 nm.

Six fractions were collected, dried, suspended in PBS and tested individually
in M. tuberculosis-infected guinea pigs for induction of delayed type hypersensitivity (DTH)
reaction. One fraction was found to induce a strong DTH reaction and was subsequently
fractionated further by RP-HPLC on a microbore Vydac C18 column (Cat. No. 218TP5115)
in a Perkin Elmer/Applied Biosystems Division Model 172 HPLC. Fractions were eluted
with a linear gradient from 5-100% buffer (0.05% TFA in acetonitrile) with a flow rate of 80
µl/minute. Eluent was monitored at 215 nm. Eight fractions were collected and tested for
induction of DTH in M. tuberculosis-infected guinea pigs. One fraction was found to induce
strong DTH of about 16 mm induration. The other fractions did not induce detectable DTH.
The positive fraction was submitted to SDS-PAGE gel electrophoresis and found to contain a
single protein band of approximately 12 kD molecular weight.

This polypeptide, hereinafter referred to as DPPD, was sequenced from the
amino terminal using a Perkin Elmer/Applied Biosystems Division Procise 492 protein
sequencer as described above and found to have the N-terminal sequence shown in SEQ ID
NO: 124. Comparison of this sequence with known sequences in the gene bank as described
above revealed no known homologies. Four cyanogen bromide fragments of DPPD were
isolated and found to have the sequences shown in SEQ ID NOS: 125-128.

EXAMPLE 5

Synthesis of Synthetic Polypeptides

Polypeptides may be synthesized on a Millipore 9050 peptide synthesizer
using FMOC chemistry with HBTU (O-Benzotriazole-N,N,N',N'-tetramethyluronium
hexafluorophosphate) activation. A Gly-Cys-Gly sequence may be attached to the amino
terminus of the peptide to provide a method of conjugation or labeling of the peptide.
Cleavage of the peptides from the solid support may be carried out using the following
cleavage mixture: trifluoroacetic acid:ethanedithiol:thioanisole:water:phenol (40:1:2:2:3).
After cleaving for 2 hours, the peptides may be precipitated in cold methyl-t-butyl-ether. The
peptide pellets may then be dissolved in water containing 0.1% trifluoroacetic acid (TFA) and
lyophilized prior to purification by C18 reverse phase HPLC. A gradient of 0-60%
acetonitrile (containing 0.1% TFA) in water (containing 0.1% TFA) may be used to elute the
peptides. Following lyophilization of the pure fractions, the peptides may be characterized
using electrospray mass spectrometry and by amino acid analysis.

This procedure was used to synthesize a TbM-1 peptide that contains one and
a half repeats of a TbM-1 sequence. The TbM-1 peptide has the sequence
GCGDRSGGNLDQIRLRRDRSGGNL (SEQ ID NO: 63).

EXAMPLE 6 

Use of Representative Antigens for Serodiagnosis of Tuberculosis 

This Example illustrates the diagnostic properties of several representative
antigens.

Assays were performed in 96-well plates coated with 200 ng antigen
diluted to 50 µL in carbonate coating buffer, pH 9.6. The wells were coated overnight at 4°C
(or 2 hours at 37°C). The plate contents were then removed and the wells were blocked for 2
hours with 200 µL of PBS/1% BSA. After the blocking step, the wells were washed five
times with PBS/0.1% Tween 20™. 50 µL sera, diluted 1:100 in PBS/0.1% Tween 20™/0.1%
BSA, was then added to each well and incubated for 30 minutes at room temperature. The
plates were then washed again five times with PBS/0.1% Tween 20™.

The enzyme conjugate (horseradish peroxidase-Protein A, Zymed, San
Francisco, CA) was then diluted 1:10,000 in PBS/0.1% Tween 20™/0.1% BSA, and 50 µL of
the diluted conjugate was added to each well and incubated for 30 minutes at room
temperature. Following incubation, the wells were washed five times with PBS/0.1% Tween
20™. 100 µL of tetramethylbenzidine peroxidase (TMB) substrate (Kirkegaard and Perry
Laboratories, Gaithersburg, MD) was added, undiluted, and incubated for about 15 minutes.
The reaction was stopped with the addition of 100 µL of 1 N H2SO4 to each well, and the
plates were read at 450 nm.

Figure 4 shows the ELISA reactivity of two recombinant antigens isolated
using method A in Example 3 (TbRa3 and TbRa9) with sera from M. tuberculosis positive
and negative patients. The reactivity of these antigens is compared to that of bacterial lysate
isolated from M. tuberculosis strain H37Ra (Difco, Detroit, MI). In both cases, the
recombinant antigens differentiated positive from negative sera. Based on cut-off values
obtained from receiver-operator curves, TbRa3 detected 56 out of 87 positive sera, and
TbRa9 detected 111 out of 165 positive sera.

Figure 5 illustrates the ELISA reactivity of representative antigens isolated
using method B of Example 3. The reactivity of the recombinant antigens TbH4, TbH12,
Tb38-1 and the peptide TbM-1 (as described in Example 4) is compared to that of the 38 kD
antigen described by Andersen and Hansen, Infect. Immun. 57:2481-2488, 1989. Again, all
of the polypeptides tested differentiated positive from negative sera. Based on cut-off values
obtained from receiver-operator curves, TbH4 detected 67 out of 126 positive sera, TbH12
detected 50 out of 125 positive sera, 38-1 detected 61 out of 101 positive sera and the TbM-1
peptide detected 25 out of 30 positive sera.
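
The detection counts reported for Figures 4 and 5 can be converted to sensitivities with a few lines of code. This is only an illustrative sketch: the counts are taken from the text, while the dictionary layout and function name are assumptions made for the example. Each antigen was tested on a different number of sera, so the fractions are not directly comparable as a ranking.

```python
# Minimal sketch: per-antigen sensitivity = detected positive sera / positive sera tested.
# Counts are those quoted in the text; the data structure is hypothetical.
detected = {
    "TbRa3": (56, 87),
    "TbRa9": (111, 165),
    "TbH4": (67, 126),
    "TbH12": (50, 125),
    "Tb38-1": (61, 101),
    "TbM-1 peptide": (25, 30),
}


def sensitivity(hits: int, tested: int) -> float:
    """Fraction of known-positive sera that scored above the cut-off."""
    return hits / tested


sensitivities = {name: sensitivity(*counts) for name, counts in detected.items()}
for name, value in sensitivities.items():
    print(f"{name:14s} {value:6.1%}")
```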

The reactivity of four antigens (TbRa3, TbRa9, TbH4 and TbH12) with sera
from a group of M. tuberculosis infected patients with differing reactivity in the acid fast stain
of sputum (Smithwick and David, Tubercle 52:226, 1971) was also examined, and compared
to the reactivity of M. tuberculosis lysate and the 38 kD antigen. The results are presented in
Table 3, below:

TABLE 3

Reactivity of Antigens with Sera from M. tuberculosis Patients

Patient        Acid Fast   ELISA Values
               Sputum      Lysate   38kD     TbRa9    TbH12    TbH4     TbRa3

Tb01B93I-2     ++++        1.853    0.634    0.998    1.022    1.030    1.314
Tb01B93M9      ++++        2.657    2.322    0.608    0.837    1.857    2.335
Tb01B93I-8     +++         2.703    0.527    0.492    0.281    0.501    2.002
Tb01B93I-10    +++         1.665    1.301    0.685    0.216    0.448    0.458
Tb01B93I-11    +++         2.817    0.697    0.509    0.301    0.173    2.608
Tb01B93I-15    +++         1.28     0.283    0.808    0.218    1.537    0.811
Tb01B93I-16    +++         2.908    >3       0.899    0.441    0.593    1.080
I OU1D7J1-ZJ   +++         0.395    0.131    0.335    0.211    0.107    0.948
Tb01B93I-87    +++         2.653    2.432    2.282    0.977    1.221    0.857
Tb01B93I-89    +++         1.912    2.370    2.436    0.876    0.520    0.952
Tb01B94I-108   +++         1.639    0.341    0.797    0.368    0.654    0.798
Tb01B94I-201   +++         1.721    0.419    0.661    0.137    0.064    0.692
Tb01B93I-88    ++          1.939    1.269    2.519    1.381    0.214    0.530
Tb01B93I-92    ++          2.355    2.329    2.78     0.685    0.997    2.527
Tb01B94I-109   ++          0.993    0.620    0.574    0.441    0.5      2.558
Tb01B94I-210   ++          2.777    >3       0.393    0.367    1.004    1.315
Tb01B94I-224   ++          2.913    0.476    0.251    1.297    1.990    0.256



WO 98/16645 PCT/US97/18214 

48 



Patient        Acid Fast   ELISA Values
               Sputum      Lysate   38kD     TbRa9    TbH12    TbH4     TbRa3

Tb01B93I-9     +           2.649    0.278    0.210    0.140    0.181    1.586
Tb01B93I-14    +           >3       1.538    0.282    0.291    0.549    2.880
Tb01B93I-21    +           2.645    0.739    2.499    0.783    0.536    1.770
Tb01B93I-22    +           0.714    0.451    2.082    0.285    0.269    1.159
Tb01B93I-31    +           0.956    0.490    1.019    0.812    0.176    1.293
Tb01B93I-32                2.261    0.786    0.668    0.273    0.535    0.405
Tb01B93I-52                0.658    0.114    0.434    0.330    0.273    1.140
Tb01B93I-99                2.118    0.584    1.62     0.119    0.977    0.729
Tb01B94I-130               1.349    0.224    0.86     0.282    0.383    2.146
Tb01B94I-131               0.685    0.324    1.173    0.059    0.118    1.431
AT4-0070       Normal      0.072    0.043    0.092    0.071    0.040    0.039
A 1 4-U 1 Uj   Normal      0.397    0.121    0.118    0.103    0.078    0.390
3/15/94-1      Normal      0.227    0.064    0.098    0.026    0.001    0.228
4/15/93-2      Normal      0.114    0.240    0.071    0.034    0.041    0.264
5/26/94-4      Normal      0.089    0.259    0.096    0.046    0.008    0.053
5/26/94-3      Normal      0.139    0.093    0.085    0.019    0.067    0.01



Based on cut-off values obtained from receiver-operator curves, TbRa3
detected 23 out of 27 positive sera, TbRa9 detected 22 out of 27, TbH4 detected 18 out of 27
and TbH12 detected 15 out of 27. If used in combination, these four antigens would have a
theoretical sensitivity of 27 out of 27, indicating that these antigens should complement each
other in the serological detection of M. tuberculosis infection. In addition, several of the
recombinant antigens detected positive sera that were not detected using the 38 kD antigen,
indicating that these antigens may be complementary to the 38 kD antigen.
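
The notion of a theoretical combined sensitivity can be illustrated as follows. The sketch assumes that a serum counts as detected by a combination when at least one antigen in the combination scores it positive; the reactivity calls below are hypothetical placeholders, not the calls underlying Table 3, since the individual cut-off values are not given in the text.

```python
# Minimal sketch: combined sensitivity as the union of per-antigen detections.
# True means the serum scored above the cut-off for that antigen (HYPOTHETICAL data).
calls = {
    "antigen_A": [True, False, False, True],
    "antigen_B": [False, True, False, True],
    "antigen_C": [False, False, True, False],
}


def combined_sensitivity(calls_by_antigen):
    """Return (sera detected by at least one antigen, sera tested)."""
    per_serum = zip(*calls_by_antigen.values())
    detected = [any(row) for row in per_serum]
    return sum(detected), len(detected)


hits, total = combined_sensitivity(calls)
print(f"combined: {hits} out of {total}")
```

In this toy panel no single antigen detects more than two of the four sera, yet the combination detects all four, which is the sense in which antigens are said to complement each other.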
The reactivity of the recombinant antigen TbRa11 with sera from
M. tuberculosis patients shown to be negative for the 38 kD antigen, as well as with sera from
PPD positive and normal donors, was determined by ELISA as described above. The results
are shown in Figure 6 which indicates that TbRa11, while being negative with sera from PPD
positive and normal donors, detected sera that were negative with the 38 kD antigen. Of the
thirteen 38 kD negative sera tested, nine were positive with TbRa11, indicating that this
antigen may be reacting with a sub-group of 38 kD antigen negative sera. In contrast, in a
group of 38 kD positive sera where TbRa11 was reactive, the mean OD 450 for TbRa11 was
lower than that for the 38 kD antigen. The data indicate an inverse relationship between the
presence of TbRa11 activity and 38 kD positivity.

The antigen TbRa2A was tested in an indirect ELISA using initially 50 µl of
serum at 1:100 dilution for 30 minutes at room temperature followed by washing in PBS
Tween and incubating for 30 minutes with biotinylated Protein A (Zymed, San Francisco,
CA) at a 1:10,000 dilution. Following washing, 50 µl of streptavidin-horseradish peroxidase
(Zymed) at 1:10,000 dilution was added and the mixture incubated for 30 minutes. After
washing, the assay was developed with TMB substrate as described above. The reactivity of
TbRa2A with sera from M. tuberculosis patients and normal donors is shown in Table 4. The
mean value for reactivity of TbRa2A with sera from M. tuberculosis patients was 0.444 with
a standard deviation of 0.309. The mean for reactivity with sera from normal donors was
0.109 with a standard deviation of 0.029. Testing of 38 kD negative sera (Figure 7) also
indicated that the TbRa2A antigen was capable of detecting sera in this category.

TABLE 4

Reactivity of TbRa2A with Sera from M. tuberculosis Patients and from Normal
Donors

Serum ID     Status    OD 450

Tb85         TB        0.680
Tb86         TB        0.450
Tb87         TB        0.263
Tb88         TB        0.275
Tb89         TB        0.403
Tb91         TB        0.393
Tb92         TB        0.401
Tb93         TB        0.232
Tb94         TB        0.333
Tb95         TB        0.435
Tb96         TB        0.284
Tb97         TB        0.320
Tb99         TB        0.328
Tb100        TB        0.817
Tb101        TB        0.607
Tb102        TB        0.191
Tb103        TB        0.228
Tb107        TB        0.324
Tb109        TB        1.572
Tb112        TB        0.338
DL4-0176     Normal    0.036
AT4-0043     Normal    0.126
AT4-0044     Normal    0.130
AT4-0052     Normal    0.135
AT4-0053     Normal    0.133
AT4-0062     Normal    0.128
AT4-0070     Normal    0.088
AT4-0091     Normal    0.108
AT4-0100     Normal    0.106
AT4-0105     Normal    0.108
AT4-0109     Normal    0.105


The reactivity of the recombinant antigen (g) (SEQ ID NO: 60) with sera from
M. tuberculosis patients and normal donors was determined by ELISA as described above.
Figure 8 shows the results of the titration of antigen (g) with four M. tuberculosis positive
sera that were all reactive with the 38 kD antigen and with four donor sera. All four positive
sera were reactive with antigen (g).

The reactivity of the recombinant antigen TbH-29 (SEQ ID NO: 137) with
sera from M. tuberculosis patients, PPD positive donors and normal donors was determined
by indirect ELISA as described above. The results are shown in Figure 9. TbH-29 detected
30 out of 60 M. tuberculosis sera, 2 out of 8 PPD positive sera and 2 out of 27 normal sera.

Figure 10 shows the results of ELISA tests (both direct and indirect) of the
antigen TbH-33 (SEQ ID NO: 140) with sera from M. tuberculosis patients and from normal
donors and with a pool of sera from M. tuberculosis patients. The mean OD 450 was
demonstrated to be higher with sera from M. tuberculosis patients than from normal donors,
with the mean OD 450 being significantly higher in the indirect ELISA than in the direct
ELISA. Figure 11 is a titration curve for the reactivity of recombinant TbH-33 with sera
from M. tuberculosis patients and from normal donors showing an increase in OD 450 with
increasing concentration of antigen.

The reactivity of the recombinant antigens RDIF6, RDIF8 and RDIF10 (SEQ
ID NOS: 184-187, respectively) with sera from M. tuberculosis patients and normal donors
was determined by ELISA as described above. RDIF6 detected 6 out of 32 M. tuberculosis
sera and 0 out of 15 normal sera; RDIF8 detected 14 out of 32 M. tuberculosis sera and 0 out
of 15 normal sera; and RDIF10 detected 4 out of 27 M. tuberculosis sera and 1 out of 15
normal sera. In addition, RDIF10 was found to detect 0 out of 5 sera from PPD-positive
donors.
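
Where counts are given for both patient and normal sera, specificity can be derived alongside sensitivity. The following sketch uses the counts quoted above for TbH-29 and the RDIF antigens; the tuple layout and the function name sens_spec are assumptions made for the example, and the small panel sizes mean the resulting fractions are rough estimates only.

```python
# Minimal sketch: sensitivity and specificity from reported detection counts.
# Tuple layout (hypothetical): (TB sera detected, TB sera tested,
#                               normal sera detected, normal sera tested)
results = {
    "TbH-29": (30, 60, 2, 27),
    "RDIF6": (6, 32, 0, 15),
    "RDIF8": (14, 32, 0, 15),
    "RDIF10": (4, 27, 1, 15),
}


def sens_spec(tp: int, n_pos: int, fp: int, n_neg: int):
    """Sensitivity = tp / n_pos; specificity = (n_neg - fp) / n_neg."""
    return tp / n_pos, (n_neg - fp) / n_neg


for antigen, counts in results.items():
    sens, spec = sens_spec(*counts)
    print(f"{antigen:7s} sensitivity {sens:5.1%}  specificity {spec:5.1%}")
```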

EXAMPLE 7

Preparation and Characterization of M. Tuberculosis Fusion Proteins 

A fusion protein containing TbRa3, the 38 kD antigen and Tb38-1 was 
prepared as follows. 

Each of the DNA constructs TbRa3, 38 kD and Tb38-1 was modified by PCR
in order to facilitate their fusion and the subsequent expression of the fusion protein
TbRa3-38 kD-Tb38-1. TbRa3, 38 kD and Tb38-1 DNA was used to perform PCR using the primers
PDM-64 and PDM-65 (SEQ ID NO: 141 and 142), PDM-57 and PDM-58 (SEQ ID NO: 143
and 144), and PDM-69 and PDM-60 (SEQ ID NO: 145-146), respectively. In each case, the
DNA amplification was performed using 10 µl 10X Pfu buffer, 2 µl 10 mM dNTPs, 2 µl each
of the PCR primers at 10 µM concentration, 81.5 µl water, 1.5 µl Pfu DNA polymerase
(Stratagene, La Jolla, CA) and 1 µl DNA at either 70 ng/µl (for TbRa3) or 50 ng/µl (for 38
kD and Tb38-1). For TbRa3, denaturation at 94°C was performed for 2 min, followed by 40
cycles of 96°C for 15 sec and 72°C for 1 min, and lastly by 72°C for 4 min. For 38 kD,
denaturation at 96°C was performed for 2 min, followed by 40 cycles of 96°C for 30 sec,
68°C for 15 sec and 72°C for 3 min, and finally by 72°C for 4 min. For Tb38-1, denaturation
at 94°C for 2 min was followed by 10 cycles of 96°C for 15 sec, 68°C for 15 sec and 72°C for
1.5 min, 30 cycles of 96°C for 15 sec, 64°C for 15 sec and 72°C for 1.5 min, and finally by 72°C
for 4 min.
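
The reaction mixture described above can be checked arithmetically. The sketch below tabulates the listed volumes and derives final concentrations by simple dilution; the labels "forward primer" and "reverse primer" and the helper final_concentration are illustrative assumptions (the text says only "2 µl each of the PCR primers").

```python
# Minimal sketch: total volume and final concentrations of one PCR reaction.
# Volumes (in microlitres) are those listed in the text.
reaction_ul = {
    "10X Pfu buffer": 10.0,
    "10 mM dNTPs": 2.0,
    "forward primer, 10 uM (assumed label)": 2.0,
    "reverse primer, 10 uM (assumed label)": 2.0,
    "water": 81.5,
    "Pfu DNA polymerase": 1.5,
    "template DNA": 1.0,
}
total_ul = sum(reaction_ul.values())


def final_concentration(stock: float, volume_ul: float, total: float) -> float:
    """Simple dilution: stock concentration * (volume added / total volume)."""
    return stock * volume_ul / total


buffer_x = final_concentration(10.0, 10.0, total_ul)   # in X
dntp_mM = final_concentration(10.0, 2.0, total_ul)     # in mM
primer_uM = final_concentration(10.0, 2.0, total_ul)   # in uM, per primer
template_ng = {"TbRa3": 70.0 * 1.0, "38 kD": 50.0 * 1.0, "Tb38-1": 50.0 * 1.0}

print(f"total {total_ul} ul; buffer {buffer_x}X; dNTPs {dntp_mM} mM; primers {primer_uM} uM")
```

The volumes sum to a round reaction volume, which is a useful sanity check on the component list as transcribed.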

The TbRa3 PCR fragment was digested with NdeI and EcoRI and cloned
directly into pT7 A L2 IL 1 vector using NdeI and EcoRI sites. The 38 kD PCR fragment was
digested with Sse8387I, treated with T4 DNA polymerase to make blunt ends and then
digested with EcoRI for direct cloning into the pT7 A L2Ra3-l vector which was digested with
StuI and EcoRI. The 38-1 PCR fragment was digested with Eco47III and EcoRI and directly
subcloned into pT7 A L2Ra3/38kD-17 digested with the same enzymes. The whole fusion was
then transferred to pET28b using NdeI and EcoRI sites. The fusion construct was confirmed
by DNA sequencing.

The expression construct was transformed to BLR pLys S E. coli (Novagen,
Madison, WI) and grown overnight in LB broth with kanamycin (30 µg/ml) and
chloramphenicol (34 µg/ml). This culture (12 ml) was used to inoculate 500 ml 2XYT with
the same antibiotics and the culture was induced with IPTG at an OD560 of 0.44 to a final
concentration of 1.2 mM. Four hours post-induction, the bacteria were harvested and
sonicated in 20 mM Tris (8.0), 100 mM NaCl, 0.1% DOC, 20 µg/ml Leupeptin, 20 mM
PMSF followed by centrifugation at 26,000 x g. The resulting pellet was resuspended in 8 M
urea, 20 mM Tris (8.0), 100 mM NaCl and bound to Pro-bond nickel resin (Invitrogen,
Carlsbad, CA). The column was washed several times with the above buffer then eluted with
an imidazole gradient (50 mM, 100 mM, 500 mM imidazole was added to 8 M urea, 20 mM
Tris (8.0), 100 mM NaCl). The eluates containing the protein of interest were then dialyzed
against 10 mM Tris (8.0).

The DNA and amino acid sequences for the resulting fusion protein
(hereinafter referred to as TbRa3-38 kD-Tb38-1) are provided in SEQ ID NO: 147 and 148,
respectively.

A fusion protein containing the two antigens TbH-9 and Tb38-1 (hereinafter
referred to as TbH9-Tb38-1) without a hinge sequence, was prepared using a similar
procedure to that described above. The DNA sequence for the TbH9-Tb38-1 fusion protein is
provided in SEQ ID NO: 151.

A fusion protein containing TbRa3, the antigen 38kD, Tb38-1 and DPEP was 
prepared as follows. 

Each of the DNA constructs TbRa3, 38 kD and Tb38-1 was modified by PCR
and cloned into vectors essentially as described above, with the primers PDM-69 (SEQ ID
NO: 145) and PDM-83 (SEQ ID NO: 200) being used for amplification of the Tb38-1A
fragment. Tb38-1A differs from Tb38-1 by a DraI site at the 3' end of the coding region that
keeps the final amino acid intact while creating a blunt restriction site that is in frame. The
TbRa3/38kD/Tb38-1A fusion was then transferred to pET28b using NdeI and EcoRI sites.

DPEP DNA was used to perform PCR using the primers PDM-84 and PDM-85
(SEQ ID NO: 201 and 202, respectively) and 1 µl DNA at 50 ng/µl. Denaturation at 94 °C
was performed for 2 min, followed by 10 cycles of 96 °C for 15 sec, 68 °C for 15 sec and 72
°C for 1.5 min; 30 cycles of 96 °C for 15 sec, 64 °C for 15 sec and 72 °C for 1.5 min; and
finally by 72 °C for 4 min. The DPEP PCR fragment was digested with EcoRI and Eco72I
and cloned directly into the pET28Ra3/38kD/38-1A construct which was digested with DraI
and EcoRI. The fusion construct was confirmed to be correct by DNA sequencing.
Recombinant protein was prepared as described above. The DNA and amino acid sequences
for the resulting fusion protein (hereinafter referred to as TbF-2) are provided in SEQ ID NO:
203 and 204, respectively.

EXAMPLE 8 
Use of M. Tuberculosis Fusion Proteins for
Serodiagnosis of Tuberculosis

The effectiveness of the fusion protein TbRa3-38 kD-Tb38-1, prepared as
described above, in the serodiagnosis of tuberculosis infection was examined by ELISA.

The ELISA protocol was as described above in Example 6, with the fusion
protein being coated at 200 ng/well. A panel of sera was chosen from a group of tuberculosis
patients previously shown, either by ELISA or by western blot analysis, to react with each of
the three antigens individually or in combination. Such a panel enabled the dissection of the
serological reactivity of the fusion protein to determine if all three epitopes functioned within
the fusion protein. As shown in Table 5, all four sera that reacted with TbRa3 only were
detectable with the fusion protein. Three sera that reacted only with Tb38-1 were also
detectable, as were two sera that reacted with 38 kD alone. The remaining 15 sera were all
positive with the fusion protein based on a cut-off in the assay of mean negatives + 3 standard
deviations. These data demonstrate the functional activity of all three epitopes in the fusion
protein.
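
A cut-off of the form "mean of the negatives + 3 standard deviations" may be sketched as below. The negative-control readings are hypothetical, the use of the sample standard deviation is an assumption, and the text does not state which negative sera entered the calculation, so this is an illustration of the rule rather than a reconstruction of the cut-off actually used.

```python
from statistics import mean, stdev


def cutoff(negative_ods, k: float = 3.0) -> float:
    """Cut-off = mean of negative-control ODs + k sample standard deviations (assumed estimator)."""
    return mean(negative_ods) + k * stdev(negative_ods)


def classify(od: float, threshold: float) -> str:
    """Score a serum '+' when its OD exceeds the cut-off, '-' otherwise."""
    return "+" if od > threshold else "-"


# HYPOTHETICAL negative-control readings, chosen only to make the arithmetic easy to follow.
negatives = [0.05, 0.06, 0.07, 0.08, 0.09]
threshold = cutoff(negatives)

print(f"cut-off: {threshold:.3f}")
print(classify(0.413, threshold), classify(0.100, threshold))
```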



TABLE 5

Reactivity of Tri-Peptide Fusion Protein with Sera from M. tuberculosis Patients

Serum ID     Status   ELISA and/or Western Blot            Fusion       Fusion
                      Reactivity with Individual Proteins  Recombinant  Recombinant
                      38kd        Tb38-1      TbRa3        OD 450       Status

01B93I-40    TB                               +            0.413        +
01B93I-41    TB                   +                        0.392
01B93I-29    TB       +                                    2.217
01B93I-109   TB       +           ±           +            0.522        +
01B93I-132   TB                   +           +            0.937
5004         TB       ±           +           ±            1.098
15004        TB       +           +                        2.077        +
39004        TB       +           +           +            1.675        +
68004        TB       +           +           +            2.388        +
99004        TB                               ±            0.607
107004       TB                   +           ±            0.667
92004        TB       +           ±           ±            1.070        +
97004        TB       +                       ±            1.152        +
118004       TB       +                       ±            2.694        +
173004       TB       +           +                        3.258
175004       TB                                            2.514
274004       TB                               +            3.220        +
276004       TB                   +                        2.991
282004       TB       +                                    0.824        +
289004       TB       -           -           +            0.848        +
308004       TB       -           +           -            3.338        +
314004       TB       -           +           -            1.362        +
317004       TB       +           -           -            0.763        +
312004       TB       -           -           +            1.079        +
D176         PPD      -           -           -            0.145        -
D162         PPD      -           -           -            0.073        -
D161         PPD      -           -           -            0.097        -
D27          PPD      -           -           -            0.082        -
A6-124       NORMAL   -           -           -            0.053        -
A6-125       NORMAL   -           -           -            0.087        -
A6-126       NORMAL   -           -           -            0.346        ±
A6-127       NORMAL   -           -           -            0.064        -
A6-128       NORMAL   -           -           -            0.034        -
A6-129       NORMAL   -           -           -            0.037        -
A6-130       NORMAL   -           -           -            0.057        -
A6-131       NORMAL   -           -           -            0.054        -
A6-132       NORMAL   -           -                        0.022        -
A6-133       NORMAL   -           -                        0.147        -
A6-134       NORMAL   -           -           -            0.101        -
A6-135       NORMAL   -           -                        0.066        -
A6-136       NORMAL                                        0.054
A6-137       NORMAL                                        0.065
A6-138       NORMAL                                        0.041
A6-139       NORMAL                                        0.103
A6-140       NORMAL                                        0.212
A6-141       NORMAL                                        0.056
A6-142       NORMAL                                        0.051





The reactivity of the fusion protein TbF-2 with sera from M. tuberculosis-infected
patients was examined by ELISA using the protocol described above. The results of
these studies (Table 6) demonstrate that all four antigens function independently in the fusion
protein.






Table 6

Reactivity of TbF-2 Fusion Protein with TB and Normal Sera

Serum ID   Status   TbF     Status   TbF-2   Status   ELISA Reactivity
                    OD450            OD450            38 kD   TbRa3   Tb38-1   DPEP

B93I-40    TB       0.57    +        0.321   +        -               -
B93I-41    TB       0.601   +        0.396   +        +               +        -
B93I-109   TB       0.494   +        0.404   +        +               +        -
B93I-132   TB       1.502            1.292   +        +       +                ±
5004       TB       1.806            1.666   +        ±       ±       +        -
15004      TB       2.862   +        2.468   +        +               +        -
39004      TB       2.443   +        1.722   +                +       +        -
68004      TB       2.871   +        2.575   +        +       +       +        -
99004      TB       0.691   +        0.971   +        -       ±       +        -
107004     TB       0.875            0.732   +        -       ±       +        -
92004      TB       1.632   +        1.394   +        +       ±       +        -
97004      TB       1.491   +        1.979   +        +       ±       -        +
118004     TB       3.182            3.045   +        +       ±       -        -
173004     TB       3.644   +        3.578   +        +       +       +        -
175004     TB       3.332   +        2.916   +        +       +       -        -
274004     TB       3.696   +        3.716   +        -       +       -        +
276004     TB       3.243   +        2.56    +        -       -       +        -
282004     TB       1.249            1.234   +                -       -        -
289004     TB       1.373   +        1.17    +        -       +       -        -
308004     TB       3.708   +        3.355   +        -       -       +        -
314004     TB       1.663   +        1.399   +        -       -       +        -
317004     TB       1.163   +        0.92    +                -       -        -
312004     TB       1.709   +        1.453   +        -       +       -        -
380004     TB       0.238   -        0.461   +        -       ±       -
451004     TB       0.18    -        0.2     -        -       -       -        ±
478004     TB       0.188   -        0.469   +
410004     TB       0.384   +        2.392   +        ±
411004     TB       0.306   +        0.874   +                +
421004     TB       0.357   +        1.456                    +                +
528004     TB       0.047            0.196                                     +
A6-87      Normal   0.094            0.063
A6-88      Normal   0.214            0.19
A6-89      Normal   0.248            0.125
A6-90      Normal   0.179            0.206
A6-91      Normal   0.135            0.151
A6-92      Normal   0.064            0.097
A6-93      Normal   0.072            0.098
A6-94      Normal   0.072            0.064
A6-95      Normal   0.125            0.159
A6-96      Normal   0.121            0.12

Cut-off             0.284            0.266

One of skill in the art will appreciate that the order of the individual antigens
within the fusion protein may be changed and that comparable activity would be expected
provided each of the epitopes is still functionally available. In addition, truncated forms of
the proteins containing active epitopes may be used in the construction of fusion proteins.

From the foregoing, it will be appreciated that, although specific embodiments
of the invention have been described herein for the purpose of illustration, various
modifications may be made without deviating from the spirit and scope of the invention.

SEQUENCE LISTING



(1) GENERAL INFORMATION: 

(i) APPLICANTS: Reed, Steven G. 

Skeiky, Yasir A.W. 
Dillon, Davin C. 
Campos-Neto, Antonia 
Houghton, Raymond 
Vedvick, Thomas S. 
Twardzik, Daniel R. 
Lodes, Michael J. 

(ii) TITLE OF INVENTION: COMPOUNDS AND METHODS FOR DIAGNOSIS OF 

TUBERCULOSIS 

(iii) NUMBER OF SEQUENCES: 209 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: SEED and BERRY LLP 

(B) STREET: 6300 Columbia Center, 701 Fifth Avenue 

(C) CITY: Seattle 

(D) STATE: Washington 

(E) COUNTRY: USA 

(F) ZIP: 98104-7092 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 01-OCT-1997 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Maki, David J. 

(B) REGISTRATION NUMBER: 31,392 

(C) REFERENCE/DOCKET NUMBER: 210121. 4 17C7 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (206) 622-4900 

(B) TELEFAX: (206) 682-6031 



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 766 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(2) INFORMATION FOR SEQ ID NO: 46:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 327 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:

CGGCACGAGA GACCGATGCC GCTACCCTCG CGCAGGAGGC AGGTAATTTC GAGCGGATCT 60

CCGGCGACCT GAAAACCCAG ATCGACCAGG TGGAGTCGAC GGCAGGTTCG TTGCAGGGCC 120

AGTGGCGCGG CGCGGCGGGG ACGGCCGCCC AGGCCGCGGT GGTGCGCTTC CAAGAAGCAG 180

CCAATAAGCA GAAGCAGGAA CTCGACGAGA TCTCGACGAA TATTCGTCAG GCCGGCGTCC 240

AATACTCGAG GGCCGACGAG GAGCAGCAGC AGGCGCTGTC CTCGCAAATG GGCTTCTGAC 300

CCGCTAATAC GAAAAGAAAC GGAGCAA 327
(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 170 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 
CGGTCGCGAT GATGGCGTTG TCGAACGTGA CCGATTCTGT ACCGCCGTCG TTGAGATCAA 60
CCAACAACGT GTTGGCGTCG GCAAATGTGC CGNACCCGTG GATCTCGGTG ATCTTGTTCT 120 
TCTTCATCAG GAAGTGCACA CCGGCCACCC TGCCCTCGGN TACCTTTCGG 170 
(2) INFORMATION FOR SEQ ID NO: 48:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 127 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

Ala Asp Gln Ala Arg Ala Gly Gly Pro Ala Arg Ile Trp Arg Glu His
20 25 30

Ser Met Ala Ala Met Lys Pro Arg Thr Gly Asp Gly Pro Leu Glu Ala
35 40 45

Thr Lys Glu Gly Arg Gly Ile Val Met Arg Val Pro Leu Glu Gly Gly
50 55 60

Gly Arg Leu Val Val Glu Leu Thr Pro Asp Glu Ala Ala Ala Leu Gly
65 70 75 80

Asp Glu Leu Lys Gly Val Thr Ser
85



(2) INFORMATION FOR SEQ ID NO: 89:



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 95 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89:

Thr Asp Ala Ala Thr Leu Ala Gln Glu Ala Gly Asn Phe Glu Arg Ile
1 5 10 15

Ser Gly Asp Leu Lys Thr Gln Ile Asp Gln Val Glu Ser Thr Ala Gly
20 25 30

Ser Leu Gln Gly Gln Trp Arg Gly Ala Ala Gly Thr Ala Ala Gln Ala
35 40 45

Ala Val Val Arg Phe Gln Glu Ala Ala Asn Lys Gln Lys Gln Glu Leu
50 55 60

Asp Glu Ile Ser Thr Asn Ile Arg Gln Ala Gly Val Gln Tyr Ser Arg
65 70 75 80

Ala Asp Glu Glu Gln Gln Gln Ala Leu Ser Ser Gln Met Gly Phe
85 90 95



(2) INFORMATION FOR SEQ ID NO: 90:



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 166 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

Gln Gln Leu Ala Gln Pro Thr Lys Ser Ile Trp Pro Phe Asp Gln Leu
210 215 220

Ser Glu Leu Trp Lys Ala Ile Ser Pro His Leu Ser Pro Leu Ser Asn
225 230 235 240

Ile Val Ser Met Leu Asn Asn His Val Ser Met Thr Asn Ser Gly Val
245 250 255

Ser Met Ala Ser Thr Leu His Ser Met Leu Lys Gly Phe Ala Pro Ala
260 265 270

Ala Ala Gln Ala Val Glu Thr Ala Ala Gln Asn Gly Val Gln Ala Met
275 280 285

Ser Ser Leu Gly Ser Gln Leu Gly Ser Ser Leu Gly Ser Ser Gly Leu
290 295 300

Gly Ala Gly Val Ala Ala Asn Leu Gly Arg Ala Ala Ser Val Gly Ser
305 310 315 320

Leu Ser Val Pro Gln Ala Trp Ala Ala Ala Asn Gln Ala Val Thr Pro
325 330 335

Ala Ala Arg Ala Leu Pro Leu Thr Ser Leu Thr Ser Ala Ala Gln Thr
340 345 350

Ala Pro Gly His Met Leu Gly Gly Leu Pro Leu Gly Gln Leu Thr Asn
355 360 365

Ser Gly Gly Gly Phe Gly Gly Val Ser Asn Ala Leu Arg Met Pro Pro
370 375 380

Arg Ala Tyr Val Met Pro Arg Val Pro Ala Ala Gly
385 390 395

(2) INFORMATION FOR SEQ ID NO: 107:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1616 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107:

CATCGGAGGG AGTGATCACC ATGCTGTGGC ACGCAATGCC ACCGGAGTAA ATACCGCACG 60

GCTGATGGCC GGCGCGGGTC CGGCTCCAAT GCTTGCGGCG GCCGCGGGAT GGCAGACGCT 120

TTCGGCGGCT CTGGACGCTC AGGCCGTCGA GTTGACCGCG CGCCTGAACT CTCTGGGAGA 180

AGCCTGGACT GGAGGTGGCA GCGACAAGGC GCTTGCGGCT GCAACGCCGA TGGTGGTCTG 240

GCTACAAACC GCGTCAACAC AGGCCAAGAC CCGTGCGATG CAGGCGACGG CGCAAGCCGC 300

GGCATACACC CAGGCCATGG CCACGACGCC GTCGCTGCCG GAGATCGCCG CCAACCACAT 360

CACCCAGGCC GTCCTTACGG CCACCAACTT CTTCGGTATC AACACGATCC CGATCGCGTT 420

GACCGAGATG GATTATTTCA TCCGTATGTG GAACCAGGCA GCCCTGGCAA TGGAGGTCTA 480

CCAGGCCGAG ACCGCGGTTA ACACGCTTTT CGAGAAGCTC GAGCCGATGG CGTCGATCCT 540

TGATCCCGGC GCGAGCCAGA GCACGACGAA CCCGATCTTC GGAATGCCCT CCCCTGGCAG 600

CTCAACACCG GTTGGCCAGT TGCCGCCGGC GGCTACCCAG ACCCTCGGCC AACTGGGTGA 660

GATGAGCGGC CCGATGGAGC AGCTGACCCA GCCGCTGCAG CAGGTGACGT CGTTGTTCAG 720

CCAGGTGGGC GGCACCGGCG GCGGCAACCC AGCCGACGAG GAAGCCGCGC AGATGGGCCT 780

GCTCGGCACC AGTCCGCTGT CGAACCATCC GCTGGCTGGT GGATCAGGCC CCAGCGCGGG 840

CGCGGGCCTG CTGCGCGCGG AGTCGCTACC TGGCGCAGGT GGGTCGTTGA CCCGCACGCC 900

GCTGATGTCT CAGCTGATCG AAAAGCCGGT TGCCCCCTCG GTGATGCCGG CGGCTGCTGC 960

CGGATCGTCG GCGACGGGTG GCGCCGCTCC GGTGGGTGCG GGAGCGATGG GCCAGGGTGC 1020

GCAATCCGGC GGCTCCACCA GGCCGGGTCT GGTCGCGCCG GCACCGCTCG CGCAGGAGCG 1080

TGAAGAAGAC GACGAGGACG ACTGGGACGA AGAGGACGAC TGGTGAGCTC CCGTAATGAC 1140

AACAGACTTC CCGGCCACCC GGGCCGGAAG ACTTGCCAAC ATTTTGGCGA GGAAGGTAAA 1200

GAGAGAAAGT AGTCCAGCAT GGCAGAGATG AAGACCGATG CCGCTACCCT CGCGCAGGAG 1260

GCAGGTAATT TCGAGCGGAT CTCCGGCGAC CTGAAAACCC AGATCGACCA GGTGGAGTCG 1320

ACGGCAGGTT CGTTGCAGGG CCAGTGGCGC GGCGCGGCGG GGACGGCCGC CCAGGCCGCG 1380

GTGGTGCGCT TCCAAGAAGC AGCCAATAAG CAGAAGCAGG AACTCGACGA GATCTCGACG 1440

AATATTCGTC AGGCCGGCGT CCAATACTCG AGGGCCGACG AGGAGCAGCA GCAGGCGCTG 1500

TCCTCGCAAA TGGGCTTCTG ACCCGCTAAT ACGAAAAGAA ACGGAGCAAA AACATGACAG 1560

AGCAGCAGTG GAATTTCGCG GGTATCGAGG CCGCGGCAAG CGCAATCCAG GGAAAT 1616

(2) INFORMATION FOR SEQ ID NO: 108:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 432 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108:
CTAGTGGATG GGACCATGGC CATTTTCTGC AGTCTCACTG CCTTCTGTGT TGACATTTTG 60

GCACGCCGGC GGAAACGAAG CACTGGGGTC GAAGAACGGC TGCGCTGCCA TATCGTCCGG 120

AGCTTCCATA CCTTCGTGCG GCCGGAAGAG CTTGTCGTAG TCGGCCGCCA TGACAACCTC 180

TCAGAGTGCG CTCAAACGTA TAAACACGAG AAAGGGCGAG ACCGACGGAA GGTCGAACTC 240

GCCCGATCCC GTGTTTCGCT ATTCTACGCG AACTCGGCGT TGCCCTATGC GAACATCCCA 300

GTGACGTTGC CTTCGGTCGA AGCCATTGCC TGACCGGCTT CGCTGATCGT CCGCGCCAGG 360

TTCTGCAGCG CGTTGTTCAG CTCGGTAGCC GTGGCGTCCC ATTTTTGCTG GACACCCTGG 420

TACGCCTCCG AA 432
(2) INFORMATION FOR SEQ ID NO: 109: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 368 amino acids 

(B) TYPE: amino acid 

{C> STRANDEDNESS: single 
(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109: 

Met Leu Trp His Ala Met Pro Pro Glu Xaa Asn Thr Ala Arg Leu Met
1 5 10 15

Ala Gly Ala Gly Pro Ala Pro Met Leu Ala Ala Ala Ala Gly Trp Gln
20 25 30

Thr Leu Ser Ala Ala Leu Asp Ala Gln Ala Val Glu Leu Thr Ala Arg
35 40 45

Leu Asn Ser Leu Gly Glu Ala Trp Thr Gly Gly Gly Ser Asp Lys Ala
50 55 60

Leu Ala Ala Ala Thr Pro Met Val Val Trp Leu Gln Thr Ala Ser Thr
65 70 75 80

Gln Ala Lys Thr Arg Ala Met Gln Ala Thr Ala Gln Ala Ala Ala Tyr
85 90 95

Thr Gln Ala Met Ala Thr Thr Pro Ser Leu Pro Glu Ile Ala Ala Asn
100 105 110

His Ile Thr Gln Ala Val Leu Thr Ala Thr Asn Phe Phe Gly Ile Asn
115 120 125

Thr Ile Pro Ile Ala Leu Thr Glu Met Asp Tyr Phe Ile Arg Met Trp
130 135 140

Asn Gln Ala Ala Leu Ala Met Glu Val Tyr Gln Ala Glu Thr Ala Val
145 150 155 160

Asn Thr Leu Phe Glu Lys Leu Glu Pro Met Ala Ser Ile Leu Asp Pro
165 170 175

Gly Ala Ser Gln Ser Thr Thr Asn Pro Ile Phe Gly Met Pro Ser Pro
180 185 190

Gly Ser Ser Thr Pro Val Gly Gln Leu Pro Pro Ala Ala Thr Gln Thr
195 200 205

Leu Gly Gln Leu Gly Glu Met Ser Gly Pro Met Gln Gln Leu Thr Gln
210 215 220

Pro Leu Gln Gln Val Thr Ser Leu Phe Ser Gln Val Gly Gly Thr Gly
225 230 235 240

Gly Gly Asn Pro Ala Asp Glu Glu Ala Ala Gln Met Gly Leu Leu Gly
245 250 255

Thr Ser Pro Leu Ser Asn His Pro Leu Ala Gly Gly Ser Gly Pro Ser 
260 265 270 

Ala Gly Ala Gly Leu Leu Arg Ala Glu Ser Leu Pro Gly Ala Gly Gly 
275 280 285 

Ser Leu Thr Arg Thr Pro Leu Met Ser Gln Leu Ile Glu Lys Pro Val
290 295 300 

Ala Pro Ser Val Met Pro Ala Ala Ala Ala Gly Ser Ser Ala Thr Gly 
305 310 315 320 

Gly Ala Ala Pro Val Gly Ala Gly Ala Met Gly Gln Gly Ala Gln Ser
325 330 335

Gly Gly Ser Thr Arg Pro Gly Leu Val Ala Pro Ala Pro Leu Ala Gln
340 345 350 

Glu Arg Glu Glu Asp Asp Glu Asp Asp Trp Asp Glu Glu Asp Asp Trp 
355 360 365 



(2) INFORMATION FOR SEQ ID NO: 110:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 100 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110: 

Met Ala Glu Met Lys Thr Asp Ala Ala Thr Leu Ala Gln Glu Ala Gly
1 5 10 15

Asn Phe Glu Arg Ile Ser Gly Asp Leu Lys Thr Gln Ile Asp Gln Val
20 25 30

Glu Ser Thr Ala Gly Ser Leu Gln Gly Gln Trp Arg Gly Ala Ala Gly
35 40 45

Thr Ala Ala Gln Ala Ala Val Val Arg Phe Gln Glu Ala Ala Asn Lys
50 55 60

Gln Lys Gln Glu Leu Asp Glu Ile Ser Thr Asn Ile Arg Gln Ala Gly
65 70 75 80

Val Gln Tyr Ser Arg Ala Asp Glu Glu Gln Gln Gln Ala Leu Ser Ser
85 90 95

Gln Met Gly Phe
100



(2) INFORMATION FOR SEQ ID NO: 111: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 396 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111: 

GATCTCCGGC GACCTGAAAA CCCAGATCGA CCAGGTGGAG TCGACGGCAG GTTCGTTGCA 60 

GGGCCAGTGG CGCGGCGCGG CGGGGACGGC CGCCCAGGCC GCGGTGGTGC GCTTCCAAGA 120 

AGCAGCCAAT AAGCAGAAGC AGGAACTCGA CGAGATCTCG ACGAATATTC GTCAGGCCGG 180 

CGTCCAATAC TCGAGGGCCG ACGAGGAGCA GCAGCAGGCG CTGTCCTCGC AAATGGGCTT 240

CTGACCCGCT AATACGAAAA GAAACGGAGC AAAAACATGA CAGAGCAGCA GTGGAATTTC 300

GCGGGTATCG AGGCCGCGGC AAGCGCAATC CAGGGAAATG TCACGTCCAT TCATTCCCTC 360



CTTGACGAGG GGAAGCAGTC CCTGACCAAG CTCGCA 396 

(2) INFORMATION FOR SEQ ID NO: 112: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 80 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112: 

Ile Ser Gly Asp Leu Lys Thr Gln Ile Asp Gln Val Glu Ser Thr Ala
1 5 10 15

Gly Ser Leu Gln Gly Gln Trp Arg Gly Ala Ala Gly Thr Ala Ala Gln
20 25 30

Ala Ala Val Val Arg Phe Gln Glu Ala Ala Asn Lys Gln Lys Gln Glu
35 40 45

Leu Asp Glu Ile Ser Thr Asn Ile Arg Gln Ala Gly Val Gln Tyr Ser
50 55 60

Arg Ala Asp Glu Glu Gln Gln Gln Ala Leu Ser Ser Gln Met Gly Phe
65 70 75 80



(2) INFORMATION FOR SEQ ID NO: 113: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 387 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113: 

GTGGATCCCG ATCCCGTGTT TCGCTATTCT ACGCGAACTC GGCGTTGCCC TATGCGAACA 60 

TCCCAGTGAC GTTGCCTTCG GTCGAAGCCA TTGCCTGACC GGCTTCGCTG ATCGTCCGCG 120 

CCAGGTTCTG CAGCGCGTTG TTCAGCTCGG TAGCCGTGGC GTCCCATTTT TGCTGGACAC 180

CCTGGTACGC CTCCGAACCG CTACCGCCCC AGGCCGCTGC GAGCTTGGTC AGGGACTGCT 240

TCCCCTCGTC AAGGAGGGAA TGAATGGACG TGACATTTCC CTGGATTGCG CTTGCCGCGG 300

CCTCGATACC CGCGAAATTC CACTGCTGCT CTGTCATGTT TTTGCTCCGT TTCTTTTCGT 360

ATTAGCGGGT CAGAAGCCCA TTTGCGA 387



(2) INFORMATION FOR SEQ ID NO: 114: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 272 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114: 

CGGCACGAGG ATCTCGGTTG GCCCAACGGC GCTGGCGAGG GCTCCGTTCC GGGGGCGAGC 60 

TGCGCGCCGG ATGCTTCCTC TGCCCGCAGC CGCGCCTGGA TGGATGGACC AGTTGCTACC 120 

TTCCCGACGT TTCGTTCGGT GTCTGTGCGA TAGCGGTGAC CCCGGCGCGC ACGTCGGGAG 180 

TGTTGGGGGG CAGGCCGGGT CGGTGGTTCG GCCGGGGACG CAGACGGTCT GGACGGAACG 240

GGCGGGGGTT CGCCGATTGG CATCTTTGCC CA 272
(2) INFORMATION FOR SEQ ID NO: 115: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115: 

Asp Pro Val Asp Ala Val Ile Asn Thr Thr Cys Asn Tyr Gly Gln Val
1 5 10 15

Val Ala Ala Leu
20



(2) INFORMATION FOR SEQ ID NO: 116:

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "PCR primer" 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 146: 
GAGAGAATTC TCAGAAGCCC ATTTGCGAGG ACA 
(2) INFORMATION FOR SEQ ID NO: 147:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1993 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 

(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 152..1273



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 147: 

TGTTCTTCGA CGGCAGGCTG GTGGAGGAAG GGCCCACCGA ACAGCTGTTC TCCTCGCCGA 60 

AGCATGCGGA AACCGCCCGA TACGTCGCCG GACTGTCGGG GGACGTCAAG GACGCCAAGC 120 

GCGGAAATTG AAGAGCACAG AAAGGTATGG C GTG AAA ATT CGT TTG CAT ACG 172
Val Lys Ile Arg Leu His Thr
1 5

CTG TTG GCC GTG TTG ACC GCT GCG CCG CTG CTG CTA GCA GCG GCG GGC 220
Leu Leu Ala Val Leu Thr Ala Ala Pro Leu Leu Leu Ala Ala Ala Gly
10 15 20

TGT GGC TCG AAA CCA CCG AGC GGT TCG CCT GAA ACG GGC GCC GGC GCC 268
Cys Gly Ser Lys Pro Pro Ser Gly Ser Pro Glu Thr Gly Ala Gly Ala
25 30 35

GGT ACT GTC GCG ACT ACC CCC GCG TCG TCG CCG GTG ACG TTG GCG GAG 316
Gly Thr Val Ala Thr Thr Pro Ala Ser Ser Pro Val Thr Leu Ala Glu
40 45 50 55

ACC GGT AGC ACG CTG CTC TAC CCG CTG TTC AAC CTG TGG GGT CCG GCC 364
Thr Gly Ser Thr Leu Leu Tyr Pro Leu Phe Asn Leu Trp Gly Pro Ala
60 65 70

TTT CAC GAG AGG TAT CCG AAC GTC ACG ATC ACC GCT CAG GGC ACC GGT 412
Phe His Glu Arg Tyr Pro Asn Val Thr Ile Thr Ala Gln Gly Thr Gly
75 80 85

TCT GGT GCC GGG ATC GCG CAG GCC GCC GCC GGG ACG GTC AAC ATT GGG 460
Ser Gly Ala Gly Ile Ala Gln Ala Ala Ala Gly Thr Val Asn Ile Gly
90 95 100

GCC TCC GAC GCC TAT CTG TCG GAA GGT GAT ATG GCC GCG CAC AAG GGG 508
Ala Ser Asp Ala Tyr Leu Ser Glu Gly Asp Met Ala Ala His Lys Gly
105 110 115

CTG ATG AAC ATC GCG CTA GCC ATC TCC GCT CAG CAG GTC AAC TAC AAC 556
Leu Met Asn Ile Ala Leu Ala Ile Ser Ala Gln Gln Val Asn Tyr Asn
120 125 130 135

CTG CCC GGA GTG AGC GAG CAC CTC AAG CTG AAC GGA AAA GTC CTG GCG 604
Leu Pro Gly Val Ser Glu His Leu Lys Leu Asn Gly Lys Val Leu Ala
140 145 150

GCC ATG TAC CAG GGC ACC ATC AAA ACC TGG GAC GAC CCG CAG ATC GCT 652
Ala Met Tyr Gln Gly Thr Ile Lys Thr Trp Asp Asp Pro Gln Ile Ala
155 160 165

GCG CTC AAC CCC GGC GTG AAC CTG CCC GGC ACC GCG GTA GTT CCG CTG 700
Ala Leu Asn Pro Gly Val Asn Leu Pro Gly Thr Ala Val Val Pro Leu
170 175 180

CAC CGC TCC GAC GGG TCC GGT GAC ACC TTC TTG TTC ACC CAG TAC CTG 748
His Arg Ser Asp Gly Ser Gly Asp Thr Phe Leu Phe Thr Gln Tyr Leu
185 190 195

TCC AAG CAA GAT CCC GAG GGC TGG GGC AAG TCG CCC GGC TTC GGC ACC 796
Ser Lys Gln Asp Pro Glu Gly Trp Gly Lys Ser Pro Gly Phe Gly Thr
200 205 210 215

ACC GTC GAC TTC CCG GCG GTG CCG GGT GCG CTG GGT GAG AAC GGC AAC 844
Thr Val Asp Phe Pro Ala Val Pro Gly Ala Leu Gly Glu Asn Gly Asn
220 225 230

GGC GGC ATG GTG ACC GGT TGC GCC GAG ACA CCG GGC TGC GTG GCC TAT 892
Gly Gly Met Val Thr Gly Cys Ala Glu Thr Pro Gly Cys Val Ala Tyr
235 240 245

ATC GGC ATC AGC TTC CTC GAC CAG GCC AGT CAA CGG GGA CTC GGC GAG 940
Ile Gly Ile Ser Phe Leu Asp Gln Ala Ser Gln Arg Gly Leu Gly Glu
250 255 260

GCC CAA CTA GGC AAT AGC TCT GGC AAT TTC TTG TTG CCC GAC GCG CAA 988
Ala Gln Leu Gly Asn Ser Ser Gly Asn Phe Leu Leu Pro Asp Ala Gln
265 270 275

AGC ATT CAG GCC GCG GCG GCT GGC TTC GCA TCG AAA ACC CCG GCG AAC 1036
Ser Ile Gln Ala Ala Ala Ala Gly Phe Ala Ser Lys Thr Pro Ala Asn
280 285 290 295

CAG GCG ATT TCG ATG ATC GAC GGG CCC GCC CCG GAC GGC TAC CCG ATC 1084
Gln Ala Ile Ser Met Ile Asp Gly Pro Ala Pro Asp Gly Tyr Pro Ile
300 305 310

ATC AAC TAC GAG TAC GCC ATC GTC AAC AAC CGG CAA AAG GAC GCC GCC 1132
Ile Asn Tyr Glu Tyr Ala Ile Val Asn Asn Arg Gln Lys Asp Ala Ala
315 320 325

ACC GCG CAG ACC TTG CAG GCA TTT CTG CAC TGG GCG ATC ACC GAC GGC 1180
Thr Ala Gln Thr Leu Gln Ala Phe Leu His Trp Ala Ile Thr Asp Gly
330 335 340

AAC AAG GCC TCG TTC CTC GAC CAG GTT CAT TTC CAG CCG CTG CCG CCC 1228
Asn Lys Ala Ser Phe Leu Asp Gln Val His Phe Gln Pro Leu Pro Pro
345 350 355

GCG GTG GTG AAG TTG TCT GAC GCG TTG ATC GCG ACG ATT TCC AGC 1273
Ala Val Val Lys Leu Ser Asp Ala Leu Ile Ala Thr Ile Ser Ser
360 365 370

TAGCCTCGTT GACCACCACG CGACAGCAAC CTCCGTCGGG CCATCGGGCT GCTTTGCGGA 1333 

GCATGCTGGC CCGTGCCGGT GAAGTCGGCC GCGCTGGCCC GGCCATCCGG TGGTTGGGTG 1393 

GGATAGGTGC GGTGATCCCG CTGCTTGCGC TGGTCTTGGT GCTGGTGGTG CTGGTCATCG 1453

AGGCGATGGG TGCGATCAGG CTCAACGGGT TGCATTTCTT CACCGCCACC GAATGGAATC 1513 

CAGGCAACAC CTACGGCGAA ACCGTTGTCA CCGACGCGTC GCCCATCCGG TCGGCGCCTA 1573

CTACGGGGCG TTGCCGCTGA TCGTCGGGAC GCTGGCGACC TCGGCAATCG CCCTGATCAT 1633 

CGCGGTGCCG GTCTCTGTAG GAGCGGCGCT GGTGATCGTG GAACGGCTGC CGAAACGGTT 1693 

GGCCGAGGCT GTGGGAATAG TCCTGGAATT GCTCGCCGGA ATCCCCAGCG TGGTCGTCGG 1753

TTTGTGGGGG GCAATGACGT TCGGGCCGTT CATCGCTCAT CACATCGCTC CGGTGATCGC 1813 

TCACAACGCT CCCGATGTGC CGGTGCTGAA CTACTTGCGC GGCGACCCGG GCAACGGGGA 1873

GGGCATGTTG GTGTCCGGTC TGGTGTTGGC GGTGATGGTC GTTCCCATTA TCGCCACCAC 1933 

CACTCATGAC CTGTTCCGGC AGGTGCCGQT GTTGCCCCGG GAGGGCGCGA TCGGGAATTC 1993 



(2) INFORMATION FOR SEQ ID NO: 148: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 374 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:148: 

Val Lys Ile Arg Leu His Thr Leu Leu Ala Val Leu Thr Ala Ala Pro
1 5 10 15

Leu Leu Leu Ala Ala Ala Gly Cys Gly Ser Lys Pro Pro Ser Gly Ser
20 25 30

Pro Glu Thr Gly Ala Gly Ala Gly Thr Val Ala Thr Thr Pro Ala Ser
35 40 45

Ser Pro Val Thr Leu Ala Glu Thr Gly Ser Thr Leu Leu Tyr Pro Leu
50 55 60

Phe Asn Leu Trp Gly Pro Ala Phe His Glu Arg Tyr Pro Asn Val Thr
65 70 75 80

Ile Thr Ala Gln Gly Thr Gly Ser Gly Ala Gly Ile Ala Gln Ala Ala
85 90 95

Ala Gly Thr Val Asn Ile Gly Ala Ser Asp Ala Tyr Leu Ser Glu Gly
100 105 110

Asp Met Ala Ala His Lys Gly Leu Met Asn Ile Ala Leu Ala Ile Ser
115 120 125

Ala Gln Gln Val Asn Tyr Asn Leu Pro Gly Val Ser Glu His Leu Lys
130 135 140

Leu Asn Gly Lys Val Leu Ala Ala Met Tyr Gln Gly Thr Ile Lys Thr
145 150 155 160

Trp Asp Asp Pro Gln Ile Ala Ala Leu Asn Pro Gly Val Asn Leu Pro
165 170 175

Gly Thr Ala Val Val Pro Leu His Arg Ser Asp Gly Ser Gly Asp Thr
180 185 190

Phe Leu Phe Thr Gln Tyr Leu Ser Lys Gln Asp Pro Glu Gly Trp Gly
195 200 205

Lys Ser Pro Gly Phe Gly Thr Thr Val Asp Phe Pro Ala Val Pro Gly
210 215 220

Ala Leu Gly Glu Asn Gly Asn Gly Gly Met Val Thr Gly Cys Ala Glu
225 230 235 240

Thr Pro Gly Cys Val Ala Tyr Ile Gly Ile Ser Phe Leu Asp Gln Ala
245 250 255

Ser Gln Arg Gly Leu Gly Glu Ala Gln Leu Gly Asn Ser Ser Gly Asn
260 265 270

Phe Leu Leu Pro Asp Ala Gln Ser Ile Gln Ala Ala Ala Ala Gly Phe
275 280 285

Ala Ser Lys Thr Pro Ala Asn Gln Ala Ile Ser Met Ile Asp Gly Pro
290 295 300

Ala Pro Asp Gly Tyr Pro Ile Ile Asn Tyr Glu Tyr Ala Ile Val Asn
305 310 315 320

Asn Arg Gln Lys Asp Ala Ala Thr Ala Gln Thr Leu Gln Ala Phe Leu
325 330 335

His Trp Ala Ile Thr Asp Gly Asn Lys Ala Ser Phe Leu Asp Gln Val
340 345 350

His Phe Gln Pro Leu Pro Pro Ala Val Val Lys Leu Ser Asp Ala Leu
355 360 365

Ile Ala Thr Ile Ser Ser
370

(2) INFORMATION FOR SEQ ID NO: 149:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1993 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 149:

TGTTCTTCGA CGGCAGGCTG GTGGAGGAAG GGCCCACCGA ACAGCTGTTC TCCTCGCCGA 60 

AGCATGCGGA AACCGCCCGA TACGTCGCCG GACTGTCGGG GGACGTCAAG GACGCCAAGC 120

GCGGAAATTG AAGAGCACAG AAAGGTATGG CGTGAAAATT CGTTTGCATA CGCTGTTGGC 180

CGTGTTGACC GCTGCGCCGC TGCTGCTAGC AGCGGCGGGC TGTGGCTCGA AACCACCGAG 240

CGGTTCGCCT GAAACGGGCG CCGGCGCCGG TACTGTCGCG ACTACCCCCG CGTCGTCGCC 300

GGTGACGTTG GCGGAGACCG GTAGCACGCT GCTCTACCCG CTGTTCAACC TGTGGGGTCC 360

GGCCTTTCAC GAGAGGTATC CGAACGTCAC GATCACCGCT CAGGGCACCG GTTCTGGTGC 420

CGGGATCGCG CAGGCCGCCG CCGGGACGGT CAACATTGGG GCCTCCGACG CCTATCTGTC 480

GGAAGGTGAT ATGGCCGCGC ACAAGGGGCT GATGAACATC GCGCTAGCCA TCTCCGCTCA 540

GCAGGTCAAC TACAACCTGC CCGGAGTGAG CGAGCACCTC AAGCTGAACG GAAAAGTCCT 600 



WO 98/16645 



218 



PCT/US97/18214 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 205: 
GGATATCTGC AGAATTCAGG TTTAAAGCCC ATTTGCGA 38 
(2) INFORMATION FOR SEQ ID NO: 206: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 206: 
CCGCATGCGA GCCACGTGCC CACAACGGCC 30 
(2) INFORMATION FOR SEQ ID NO: 207: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 207: 
CTTCATGGAA TTCTCAGGCC GGTAAGGTCC GCTGCGG 37 
(2) INFORMATION FOR SEQ ID NO: 208: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7676 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:208: 
TGGCGAATGG GACGCGCCCT GTAGCGGCGC ATTAAGCGCG GCGGGTGTGG TGGTTACGCG 60

CAGCGTGACC GCTACACTTG CCAGCGCCCT AGCGCCCGCT CCTTTCGCTT TCTTCCCTTC 120

CTTTCTCGCC ACGTTCGCCG GCTTTCCCCG TCAAGCTCTA AATCGGGGGC TCCCTTTAGG 180

GTTCCGATTT AGTGCTTTAC GGCACCTCGA CCCCAAAAAA CTTGATTAGG GTGATGGTTC 240

ACGTAGTGGG CCATCGCCCT GATAGACGGT TTTTCGCCCT TTGACGTTGG AGTCCACGTT 300 

CTTTAATAGT GGACTCTTGT TCCAAACTGG AACAACACTC AACCCTATCT CGGTCTATTC 360 

TTTTGATTTA TAAGGGATTT TGCCGATTTC GGCCTATTGG TTAAAAAATG AGCTGATTTA 420 

ACAAAAATTT AACGCGAATT TTAACAAAAT ATTAACGTTT ACAATTTCAG GTGGCACTTT 480

TCGGGGAAAT GTGCGCGGAA CCCCTATTTG TTTATTTTTC TAAATACATT CAAATATGTA 540

TCCGCTCATG AATTAATTCT TAGAAAAACT CATCGAGCAT CAAATGAAAC TGCAATTTAT 600 

TCATATCAGG ATTATCAATA CCATATTTTT GAAAAAGCCG TTTCTGTAAT GAAGGAGAAA 660

ACTCACCGAG GCAGTTCCAT AGGATGGCAA GATCCTGGTA TCGGTCTGCG ATTCCGACTC 720 

GTCCAACATC AATACAACCT ATTAATTTCC CCTCGTCAAA AATAAGGTTA TCAAGTGAGA 780

AATCACCATG AGTGACGACT GAATCCGGTG AGAATGGCAA AAGTTTATGC ATTTCTTTCC 840

AGACTTGTTC AACAGGCCAG CCATTACGCT CGTCATCAAA ATCACTCGCA TCAACCAAAC 900 

CGTTATTCAT TCGTGATTGC GCCTGAGCGA GACGAAATAC GCGATCGCTG TTAAAAGGAC 960

AATTACAAAC AGGAATCGAA TGCAACCGGC GCAGGAACAC TGCCAGCGCA TCAACAATAT 1020

TTTCACCTGA ATCAGGATAT TCTTCTAATA CCTGGAATGC TGTTTTCCCG GGGATCGCAG 1080

TGGTGAGTAA CCATGCATCA TCAGGAGTAC GGATAAAATG CTTGATGGTC GGAAGAGGCA 1140

TAAATTCCGT CAGCCAGTTT AGTCTGACCA TCTCATCTGT AACATCATTG GCAACGCTAC 1200 

CTTTGCCATG TTTCAGAAAC AACTCTGGCG CATCGGGCTT CCCATACAAT CGATAGATTG 1260

TCGCACCTGA TTGCCCGACA TTATCGCGAG CCCATTTATA CCCATATAAA TCAGCATCCA 1320 

TGTTGGAATT TAATCGCGGC CTAGAGCAAG ACGTTTCCCG TTGAATATGG CTCATAACAC 1380

CCCTTGTATT ACTGTTTATG TAAGCAGACA GTTTTATTGT TCATGACCAA AATCCCTTAA 1440

CGTGAGTTTT CGTTCCACTG AGCGTCAGAC CCCGTAGAAA AGATCAAAGG ATCTTCTTGA 1500

GATCCTTTTT TTCTGCGCGT AATCTGCTGC TTGCAAACAA AAAAACCACC GCTACCAGCG 1560

GTGGTTTGTT TGCCGGATCA AGAGCTACCA ACTCTTTTTC CGAAGGTAAC TGGCTTCAGC 1620 

AGAGCGCAGA TACCAAATAC TGTCCTTCTA GTGTAGCCGT AGTTAGGCCA CCACTTCAAG 1680




AACTCTGTAG CACCGCCTAC ATACCTCGCT CTGCTAATCC TGTTACCAGT GGCTGCTGCC 1740

AGTGGCGATA AGTCGTGTCT TACCGGGTTG GACTCAAGAC GATAGTTACC GGATAAGGCG 1800 

CAGCGGTCGG GCTGAACGGG GGGTTCGTGC ACACAGCCCA GCTTGGAGCG AACGACCTAC 1860

ACCGAACTGA GATACCTACA GCGTGAGCTA TGAGAAAGCG CCACGCTTCC CGAAGGGAGA 1920 

AAGGCGGACA GGTATCCGGT AAGCGGCAGG GTCGGAACAG GAGAGCGCAC GAGGGAGCTT 1980

CCAGGGGGAA ACGCCTGGTA TCTTTATAGT CCTGTCGGGT TTCGCCACCT CTGACTTGAG 2040

CGTCGATTTT TGTGATGCTC GTCAGGGGGG CGGAGCCTAT GGAAAAACGC CAGCAACGCG 2100 

GCCTTTTTAC GGTTCCTGGC CTTTTGCTGG CCTTTTGCTC ACATGTTCTT TCCTGCGTTA 2160 

TCCCCTGATT CTGTGGATAA CCGTATTACC GCCTTTGAGT GAGCTGATAC CGCTCGCCGC 2220

AGCCGAACGA CCGAGCGCAG CGAGTCAGTG AGCGAGGAAG CGGAAGAGCG CCTGATGCGG 2280 

TATTTTCTCC TTACGCATCT GTGCGGTATT TCACACCGCA TATATGGTGC ACTCTCAGTA 2340

CAATCTGCTC TGATGCCGCA TAGTTAAGCC AGTATACACT CCGCTATCGC TACGTGACTG 2400

GGTCATGGCT GCGCCCCGAC ACCCGCCAAC ACCCGCTGAC GCGCCCTGAC GGGCTTGTCT 2460

GCTCCCGGCA TCCGCTTACA GACAAGCTGT GACCGTCTCC GGGAGCTGCA TGTGTCAGAG 2520 

GTTTTCACCG TCATCACCGA AACGCGCGAG GCAGCTGCGG TAAAGCTCAT CAGCGTGGTC 2580

GTGAAGCGAT TCACAGATGT CTGCCTGTTC ATCCGCGTCC AGCTCGTTGA GTTTCTCCAG 2640

AAGCGTTAAT GTCTGGCTTC TGATAAAGCG GGCCATGTTA AGGGCGGTTT TTTCCTGTTT 2700 

GGTCACTGAT GCCTCCGTGT AAGGGGGATT TCTGTTCATG GGGGTAATGA TACCGATGAA 2760

ACGAGAGAGG ATGCTCACGA TACGGGTTAC TGATGATGAA CATGCCCGGT TACTGGAACG 2820

TTGTGAGGGT AAACAACTGG CGGTATGGAT GCGGCGGGAC CAGAGAAAAA TCACTCAGGG 2880

TCAATGCCAG CGCTTCGTTA ATACAGATGT AGGTGTTCCA CAGGGTAGCC AGCAGCATCC 2940

TGCGATGCAG ATCCGGAACA TAATGGTGCA GGGCGCTGAC TTCCGCGTTT CCAGACTTTA 3000

CGAAACACGG AAACCGAAGA CCATTCATGT TGTTGCTCAG GTCGCAGACG TTTTGCAGCA 3060

GCAGTCGCTT CACGTTCGCT CGCGTATCGG TGATTCATTC TGCTAACCAG TAAGGCAACC 3120 

CCGCCAGCCT AGCCGGGTCC TCAACGACAG GAGCACGATC ATGCGCACCC GTGGGGCCGC 3180 

CATGCCGGCG ATAATGGCCT GCTTCTCGCC GAAACGTTTG GTGGCGGGAC CAGTGACGAA 3240

GGCTTGAGCG AGGGCGTGCA AGATTCCGAA TACCGCAAGC GACAGGCCGA TCATCGTCGC 3300

GCTCCAGCGA AAGCGGTCCT CGCCGAAAAT GACCCAGAGC GCTGCCGGCA CCTGTCCTAC 3360

GAGTTGCATG ATAAAGAAGA CAGTCATAAG TGCGGCGACG ATAGTCATGC CCCGCGCCCA 3420

CCGGAAGGAG CTGACTGGGT TGAAGGCTCT CAAGGGCATC GGTCGAGATC CCGGTGCCTA 3480

ATGAGTGAGC TAACTTACAT TAATTGCGTT GCGCTCACTG CCCGCTTTCC AGTCGGGAAA 3540

CCTGTCGTGC CAGCTGCATT AATGAATCGG CCAACGCGCG GGGAGAGGCG GTTTGCGTAT 3600

TGGGCGCCAG GGTGGTTTTT CTTTTCACCA GTGAGACGGG CAACAGCTGA TTGCCCTTCA 3660

CCGCCTGGCC CTGAGAGAGT TGCAGCAAGC GGTCCACGCT GGTTTGCCCC AGCAGGCGAA 3720

AATCCTGTTT GATGGTGGTT AACGGCGGGA TATAACATGA GCTGTCTTCG GTATCGTCGT 3780

ATCCCACTAC CGAGATATCC GCACCAACGC GCAGCCCGGA CTCGGTAATG GCGCGCATTG 3840

CGCCCAGCGC CATCTGATCG TTGGCAACCA GCATCGCAGT GGGAACGATG CCCTCATTCA 3900

GCATTTGCAT GGTTTGTTGA AAACCGGACA TGGCACTCCA GTCGCCTTCC CGTTCCGCTA 3960

TCGGCTGAAT TTGATTGCGA GTGAGATATT TATGCCAGCC AGCCAGACGC AGACGCGCCG 4020

AGACAGAACT TAATGGGCCC GCTAACAGCG CGATTTGCTG GTGACCCAAT GCGACCAGAT 4080

GCTCCACGCC CAGTCGCGTA CCGTCTTCAT GGGAGAAAAT AATACTGTTG ATGGGTGTCT 4140

GGTCAGAGAC ATCAAGAAAT AACGCCGGAA CATTAGTGCA GGCAGCTTCC ACAGCAATGG 4200

CATCCTGGTC ATCCAGCGGA TAGTTAATGA TCAGCCCACT GACGCGTTGC GCGAGAAGAT 4260

TGTGCACCGC CGCTTTACAG GCTTCGACGC CGCTTCGTTC TACCATCGAC ACCACCACGC 4320

TGGCACCCAG TTGATCGGCG CGAGATTTAA TCGCCGCGAC AATTTGCGAC GGCGCGTGCA 4380

GGGCCAGACT GGAGGTGGCA ACGCCAATCA GCAACGACTG TTTGCCCGCC AGTTGTTGTG 4440

CCACGCGGTT GGGAATGTAA TTCAGCTCCG CCATCGCCGC TTCCACTTTT TCCCGCGTTT 4500

TCGCAGAAAC GTGGCTGGCC TGGTTCACCA CGCGGGAAAC GGTCTGATAA GAGACACCGG 4560

CATACTCTGC GACATCGTAT AACGTTACTG GTTTCACATT CACCACCCTG AATTGACTCT 4620

CTTCCGGGCG CTATCATGCC ATACCGCGAA AGGTTTTGCG CCATTCGATG GTGTCCGGGA 4680

TCTCGACGCT CTCCCTTATG CGACTCCTGC ATTAGGAAGC AGCCCAGTAG TAGGTTGAGG 4740

CCGTTGAGCA CCGCCGCCGC AAGGAATGGT GCATGCAAGG AGATGGCGCC CAACAGTCCC 4800

CCGGCCACGG GGCCTGCCAC CATACCCACG CCGAAACAAG CGCTCATGAG CCCGAAGTGG 4860

CGAGCCCGAT CTTCCCCATC GGTGATGTCG GCGATATAGG CGCCAGCAAC CGCACCTGTG 4920

GCGCCGGTGA TGCCGGCCAC GATGCGTCCG GCGTAGAGGA TCGAGATCTC GATCCCGCGA 4980

AATTAATACG ACTCACTATA GGGGAATTGT GAGCGGATAA CAATTCCCCT CTAGAAATAA 5040

TTTTGTTTAA CTTTAAGAAG GAGATATACA TATGGGCCAT CATCATCATC ATCACGTGAT 5100 

CGACATCATC GGGACCAGCC CCACATCCTG GGAACAGGCG GCGGCGGAGG CGGTCCAGCG 5160 

GGCGCGGGAT AGCGTCGATG ACATCCGCGT CGCTCGGGTC ATTGAGCAGG ACATGGCCGT 5220 

GGACAGCGCC GGCAAGATCA CCTACCGCAT CAAGCTCGAA GTGTCGTTCA AGATGAGGCC 5280

GGCGCAACCG AGGGGCTCGA AACCACCGAG CGGTTCGCCT GAAACGGGCG CCGGCGCCGG 5340

TACTGTCGCG ACTACCCCCG CGTCGTCGCC GGTGACGTTG GCGGAGACCG GTAGCACGCT 5400

GCTCTACCCG CTGTTCAACC TGTGGGGTCC GGCCTTTCAC GAGAGGTATC CGAACGTCAC 5460

GATCACCGCT CAGGGCACCG GTTCTGGTGC CGGGATCGCG CAGGCCGCCG CCGGGACGGT 5520

CAACATTGGG GCCTCCGACG CCTATCTGTC GGAAGGTGAT ATGGCCGCGC ACAAGGGGCT 5580

GATGAACATC GCGCTAGCCA TCTCCGCTCA GCAGGTCAAC TACAACCTGC CCGGAGTGAG 5640

CGAGCACCTC AAGCTGAACG GAAAAGTCCT GGCGGCCATG TACCAGGGCA CCATCAAAAC 5700

CTGGGACGAC CCGCAGATCG CTGCGCTCAA CCCCGGCGTG AACCTGCCCG GCACCGCGGT 5760

AGTTCCGCTG CACCGCTCCG ACGGGTCCGG TGACACCTTC TTGTTCACCC AGTACCTGTC 5820

CAAGCAAGAT CCCGAGGGCT GGGGCAAGTC GCCCGGCTTC GGCACCACCG TCGACTTCCC 5880

GGCGGTGCCG GGTGCGCTGG GTGAGAACGG CAACGGCGGC ATGGTGACCG GTTGCGCCGA 5940

GACACCGGGC TGCGTGGCCT ATATCGGCAT CAGCTTCCTC GACCAGGCCA GTCAACGGGG 6000

ACTCGGCGAG GCCCAACTAG GCAATAGCTC TGGCAATTTC TTGTTGCCCG ACGCGCAAAG 6060 

CATTCAGGCC GCGGCGGCTG GCTTCGCATC GAAAACCCCG GCGAACCAGG CGATTTCGAT 6120 

GATCGACGGG CCCGCCCCGG ACGGCTACCC GATCATCAAC TACGAGTACG CCATCGTCAA 6180

CAACCGGCAA AAGGACGCCG CCACCGCGCA GACCTTGCAG GCATTTCTGC ACTGGGCGAT 6240

CACCGACGGC AACAAGGCCT CGTTCCTCGA CCAGGTTCAT TTCCAGCCGC TGCCGCCCGC 6300 

GGTGGTGAAG TTGTCTGACG CGTTGATCGC GACGATTTCC AGCGCTGAGA TGAAGACCGA 6360

TGCCGCTACC CTCGCGCAGG AGGCAGGTAA TTTCGAGCGG ATCTCCGGCG ACCTGAAAAC 6420

CCAGATCGAC CAGGTGGAGT CGACGGCAGG TTCGTTGCAG GGCCAGTGGC GCGGCGCGGC 6480

GGGGACGGCC GCCCAGGCCG CGGTGGTGCG CTTCCAAGAA GCAGCCAATA AGCAGAAGCA 6540

GGAACTCGAC GAGATCTCGA CGAATATTCG TCAGGCCGGC GTCCAATACT CGAGGGCCGA 6600

CGAGGAGCAG CAGCAGGCGC TGTCCTCGCA AATGGGCTTT GTGCCCACAA CGGCCGCCTC 6660

GCCGCCGTCG ACCGCTGCAG CGCCACCCGC ACCGGCGACA CCTGTTGCCC CCCCACCACC 6720

GGCCGCCGCC AACACGCCGA ATGCCCAGCC GGGCGATCCC AACGCAGCAC CTCCGCCGGC 6780

CGACCCGAAC GCACCGCCGC CACCTGTCAT TGCCCCAAAC GCACCCCAAC CTGTCCGGAT 6840

CGACAACCCG GTTGGAGGAT TCAGCTTCGC GCTGCCTGCT GGCTGGGTGG AGTCTGACGC 6900

CGCCCACTTC GACTACGGTT CAGCACTCCT CAGCAAAACC ACCGGGGACC CGCCATTTCC 6960

CGGACAGCCG CCGCCGGTGG CCAATGACAC CCGTATCGTG CTCGGCCGGC TAGACCAAAA 7020

GCTTTACGCC AGCGCCGAAG CCACCGACTC CAAGGCCGCG GCCCGGTTGG GCTCGGACAT 7080

GGGTGAGTTC TATATGCCCT ACCCGGGCAC CCGGATCAAC CAGGAAACCG TCTCGCTTGA 7140

CGCCAACGGG GTGTCTGGAA GCGCGTCGTA TTACGAAGTC AAGTTCAGCG ATCCGAGTAA 7200

GCCGAACGGC CAGATCTGGA CGGGCGTAAT CGGCTCGCCC GCGGCGAACG CACCGGACGC 7260

CGGGCCCCCT CAGCGCTGGT TTGTGGTATG GCTCGGGACC GCCAACAACC CGGTGGACAA 7320

GGGCGCGGCC AAGGCGCTGG CCGAATCGAT CCGGCCTTTG GTCGCCCCGC CGCCGGCGCC 7380

GGCACCGGCT CCTGCAGAGC CCGCTCCGGC GCCGGCGCCG GCCGGGGAAG TCGCTCCTAC 7440

CCCGACGACA CCGACACCGC AGCGGACCTT ACCGGCCTGA GAATTCTGCA GATATCCATC 7500

ACACTGGCGG CCGCTCGAGC ACCACCACCA CCACCACTGA GATCCGGCTG CTAACAAAGC 7560

CCGAAAGGAA GCTGAGTTGG CTGCTGCCAC CGCTGAGCAA TAACTAGCAT AACCCCTTGG 7620

GGCCTCTAAA CGGGTCTTGA GGGGTTTTTT GCTGAAAGGA GGAACTATAT CCGGAT 7676



(2) INFORMATION FOR SEQ ID NO: 209: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 802 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 209: 

Met Gly His His His His His His Val Ile Asp Ile Ile Gly Thr Ser
1 5 10 15

Pro Thr Ser Trp Glu Gln Ala Ala Ala Glu Ala Val Gln Arg Ala Arg
20 25 30

Asp Ser Val Asp Asp Ile Arg Val Ala Arg Val Ile Glu Gln Asp Met
35 40 45

Ala Val Asp Ser Ala Gly Lys Ile Thr Tyr Arg Ile Lys Leu Glu Val
50 55 60

Ser Phe Lys Met Arg Pro Ala Gln Pro Arg Gly Ser Lys Pro Pro Ser
65 70 75 80

Gly Ser Pro Glu Thr Gly Ala Gly Ala Gly Thr Val Ala Thr Thr Pro
85 90 95

Ala Ser Ser Pro Val Thr Leu Ala Glu Thr Gly Ser Thr Leu Leu Tyr
100 105 110

Pro Leu Phe Asn Leu Trp Gly Pro Ala Phe His Glu Arg Tyr Pro Asn
115 120 125

Val Thr Ile Thr Ala Gln Gly Thr Gly Ser Gly Ala Gly Ile Ala Gln
130 135 140

Ala Ala Ala Gly Thr Val Asn Ile Gly Ala Ser Asp Ala Tyr Leu Ser
145 150 155 160

Glu Gly Asp Met Ala Ala His Lys Gly Leu Met Asn Ile Ala Leu Ala
165 170 175

Ile Ser Ala Gln Gln Val Asn Tyr Asn Leu Pro Gly Val Ser Glu His
180 185 190

Leu Lys Leu Asn Gly Lys Val Leu Ala Ala Met Tyr Gln Gly Thr Ile
195 200 205

Lys Thr Trp Asp Asp Pro Gln Ile Ala Ala Leu Asn Pro Gly Val Asn
210 215 220

Leu Pro Gly Thr Ala Val Val Pro Leu His Arg Ser Asp Gly Ser Gly
225 230 235 240

Asp Thr Phe Leu Phe Thr Gln Tyr Leu Ser Lys Gln Asp Pro Glu Gly
245 250 255

Trp Gly Lys Ser Pro Gly Phe Gly Thr Thr Val Asp Phe Pro Ala Val
260 265 270

Pro Gly Ala Leu Gly Glu Asn Gly Asn Gly Gly Met Val Thr Gly Cys
275 280 285

Ala Glu Thr Pro Gly Cys Val Ala Tyr Ile Gly Ile Ser Phe Leu Asp
290 295 300

Gln Ala Ser Gln Arg Gly Leu Gly Glu Ala Gln Leu Gly Asn Ser Ser
305 310 315 320

Gly Asn Phe Leu Leu Pro Asp Ala Gln Ser Ile Gln Ala Ala Ala Ala
325 330 335

Gly Phe Ala Ser Lys Thr Pro Ala Asn Gln Ala Ile Ser Met Ile Asp
340 345 350

Gly Pro Ala Pro Asp Gly Tyr Pro Ile Ile Asn Tyr Glu Tyr Ala Ile
355 360 365

Val Asn Asn Arg Gln Lys Asp Ala Ala Thr Ala Gln Thr Leu Gln Ala
370 375 380

Phe Leu His Trp Ala Ile Thr Asp Gly Asn Lys Ala Ser Phe Leu Asp
385 390 395 400

Gln Val His Phe Gln Pro Leu Pro Pro Ala Val Val Lys Leu Ser Asp
405 410 415

Ala Leu Ile Ala Thr Ile Ser Ser Ala Glu Met Lys Thr Asp Ala Ala
420 425 430

Thr Leu Ala Gln Glu Ala Gly Asn Phe Glu Arg Ile Ser Gly Asp Leu
435 440 445

Lys Thr Gln Ile Asp Gln Val Glu Ser Thr Ala Gly Ser Leu Gln Gly
450 455 460

Gln Trp Arg Gly Ala Ala Gly Thr Ala Ala Gln Ala Ala Val Val Arg
465 470 475 480

Phe Gln Glu Ala Ala Asn Lys Gln Lys Gln Glu Leu Asp Glu Ile Ser
485 490 495

Thr Asn Ile Arg Gln Ala Gly Val Gln Tyr Ser Arg Ala Asp Glu Glu
500 505 510

Gln Gln Gln Ala Leu Ser Ser Gln Met Gly Phe Val Pro Thr Thr Ala
515 520 525

Ala Ser Pro Pro Ser Thr Ala Ala Ala Pro Pro Ala Pro Ala Thr Pro
530 535 540

Val Ala Pro Pro Pro Pro Ala Ala Ala Asn Thr Pro Asn Ala Gln Pro
545 550 555 560

Gly Asp Pro Asn Ala Ala Pro Pro Pro Ala Asp Pro Asn Ala Pro Pro
565 570 575

Pro Pro Val Ile Ala Pro Asn Ala Pro Gln Pro Val Arg Ile Asp Asn
580 585 590

Pro Val Gly Gly Phe Ser Phe Ala Leu Pro Ala Gly Trp Val Glu Ser
595 600 605

Asp Ala Ala His Phe Asp Tyr Gly Ser Ala Leu Leu Ser Lys Thr Thr
610 615 620

Gly Asp Pro Pro Phe Pro Gly Gln Pro Pro Pro Val Ala Asn Asp Thr
625 630 635 640

Arg Ile Val Leu Gly Arg Leu Asp Gln Lys Leu Tyr Ala Ser Ala Glu
645 650 655

Ala Thr Asp Ser Lys Ala Ala Ala Arg Leu Gly Ser Asp Met Gly Glu
660 665 670

Phe Tyr Met Pro Tyr Pro Gly Thr Arg Ile Asn Gln Glu Thr Val Ser
675 680 685

Leu Asp Ala Asn Gly Val Ser Gly Ser Ala Ser Tyr Tyr Glu Val Lys
690 695 700

Phe Ser Asp Pro Ser Lys Pro Asn Gly Gln Ile Trp Thr Gly Val Ile
705 710 715 720

Gly Ser Pro Ala Ala Asn Ala Pro Asp Ala Gly Pro Pro Gln Arg Trp
725 730 735

Phe Val Val Trp Leu Gly Thr Ala Asn Asn Pro Val Asp Lys Gly Ala
740 745 750

Ala Lys Ala Leu Ala Glu Ser Ile Arg Pro Leu Val Ala Pro Pro Pro
755 760 765

Ala Pro Ala Pro Ala Pro Ala Glu Pro Ala Pro Ala Pro Ala Pro Ala
770 775 780

Gly Glu Val Ala Pro Thr Pro Thr Thr Pro Thr Pro Gln Arg Thr Leu
785 790 795 800



Pro Ala

CLAIMS

We claim: 

1. A polypeptide comprising an antigenic portion of a soluble
M. tuberculosis antigen, or a variant of said antigen that differs only in conservative 
substitutions and/or modifications, wherein said antigen has an N-terminal sequence selected 
from the group consisting of: 

(a) Asp-Pro-Val-Asp-Ala-Val-Ile-Asn-Thr-Thr-Cys-Asn-Tyr-Gly-Gln-
Val-Val-Ala-Ala-Leu (SEQ ID NO: 115);

(b) Ala-Val-Glu-Ser-Gly-Met-Leu-Ala-Leu-Gly-Thr-Pro-Ala-Pro-Ser
(SEQ ID NO: 116);

(c) Ala-Ala-Met-Lys-Pro-Arg-Thr-Gly-Asp-Gly-Pro-Leu-Glu-Ala-Ala- 
Lys-Glu-Gly-Arg (SEQ ID NO: 117);

(d) Tyr-Tyr-Trp-Cys-Pro-Gly-Gln-Pro-Phe-Asp-Pro-Ala-Trp-Gly-Pro 
(SEQ ID NO: 118); 

(e) Asp-Ile-Gly-Ser-Glu-Ser-Thr-Glu-Asp-Gln-Gln-Xaa-Ala-Val (SEQ ID 
NO: 119); 

(f) Ala-Glu-Glu-Ser-Ile-Ser-Thr-Xaa-Glu-Xaa-Ile-Val-Pro (SEQ ID 
NO: 120); 

(g) Asp-Pro-Glu-Pro-Ala-Pro-Pro-Val-Pro-Thr-Thr-Ala-Ala-Ser-Pro-Pro- 
Ser (SEQ ID NO: 121); 

(h) Ala-Pro-Lys-Thr-Tyr-Xaa-Glu-Glu-Leu-Lys-Gly-Thr-Asp-Thr-Gly 
(SEQ ID NO: 122); 

(i) Asp-Pro-Ala-Ser-Ala-Pro-Asp-Val-Pro-Thr-Ala-Ala-Gln-Leu-Thr-Ser- 
Leu-Leu-Asn-Ser-Leu-Ala-Asp-Pro-Asn-Val-Ser-Phe-Ala-Asn (SEQ 
ID NO: 123); and 

(j) Ala-Pro-Glu-Ser-Gly-Ala-Gly-Leu-Gly-Gly-Thr-Val-Gln-Ala-Gly; 
(SEQ ID NO: 131) 
wherein Xaa may be any amino acid.

2. A polypeptide comprising an immunogenic portion of an
M. tuberculosis antigen, or a variant of said antigen that differs only in conservative
substitutions and/or modifications, wherein said antigen has an N-terminal sequence selected 
from the group consisting of: 

(a) Asp-Pro-Pro-Asp-Pro-His-Gln-Xaa-Asp-Met-Thr-Lys-Gly-Tyr-Tyr-
Pro-Gly-Gly-Arg-Arg-Xaa-Phe; (SEQ ID NO: 124) and 

(b) Xaa-Tyr-Ile-Ala-Tyr-Xaa-Thr-Thr-Ala-Gly-Ile-Val-Pro-Gly-Lys-Ile- 
Asn-Val-His-Leu-Val; (SEQ ID NO: 132), wherein Xaa may be any 
amino acid. 

3. A polypeptide comprising an antigenic portion of a soluble 
M. tuberculosis antigen, or a variant of said antigen that differs only in conservative 
substitutions and/or modifications, wherein said antigen comprises an amino acid sequence 
encoded by a DNA sequence selected from the group consisting of the sequences recited in 
SEQ ID NOS: 1, 2, 4-10, 13-25, 52, 94 and 96, the complements of said sequences, and DNA 
sequences that hybridize to a sequence recited in SEQ ID NOS: 1, 2, 4-10, 13-25, 52, 94 and 
96 or a complement thereof under moderately stringent conditions. 

4. A polypeptide comprising an antigenic portion of a M. tuberculosis 
antigen, or a variant of said antigen that differs only in conservative substitutions and/or 
modifications, wherein said antigen comprises an amino acid sequence encoded by a DNA 
sequence selected from the group consisting of the sequences recited in SEQ ID NOS: 26-51, 
133, 134, 158-178 and 196, the complements of said sequences, and DNA sequences that 
hybridize to a sequence recited in SEQ ID NOS: 26-51, 133, 134, 158-178 and 196 or a 
complement thereof under moderately stringent conditions. 

5. A DNA molecule comprising a nucleotide sequence encoding a 
polypeptide according to any one of claims 1-4.

6. A recombinant expression vector comprising a DNA molecule
according to claim 5. 

7. A host cell transformed with an expression vector according to claim 6. 

8. The host cell of claim 7 wherein the host cell is selected from the group
consisting of E. coli, yeast and mammalian cells.



9. A method for detecting M. tuberculosis infection in a biological 
sample, comprising: 

(a) contacting a biological sample with one or more polypeptides 
according to any of claims 1-4; and 

(b) detecting in the sample the presence of antibodies that bind to at least 
one of the polypeptides, thereby detecting M. tuberculosis infection in the biological sample. 

10. A method for detecting M. tuberculosis infection in a biological
sample, comprising: 

(a) contacting a biological sample with a polypeptide having an N- 
terminal sequence selected from the group consisting of sequences provided in SEQ ID NO: 
129 and 130; and 

(b) detecting in the sample the presence of antibodies that bind to at least 
one of the polypeptides, thereby detecting M. tuberculosis infection in the biological sample. 

11. A method for detecting M. tuberculosis infection in a biological
sample, comprising: 

(a) contacting a biological sample with one or more polypeptides encoded 
by a DNA sequence selected from the group consisting of SEQ ID NOS: 3, 11, 12, 135, 136, 
151-155, 184-188, 194-195 and 198, the complements of said sequences, and DNA sequences 
that hybridize to a sequence recited in SEQ ID NOS: 3, 11, 12, 135, 136, 151-155, 184-188, 
194-195 and 198; and 

(b) detecting in the sample the presence of antibodies that bind to at least
one of the polypeptides, thereby detecting M. tuberculosis infection in the biological sample. 

12. The method of any one of claims 9-11 wherein step (a) additionally 
comprises contacting the biological sample with a 38 kD M. tuberculosis antigen and step (b) 
additionally comprises detecting in the sample the presence of antibodies that bind to the 
38 kD M. tuberculosis antigen.

13. The method of any one of claims 9-11 wherein the polypeptide(s) are
bound to a solid support. 

14. The method of claim 13 wherein the solid support comprises 
nitrocellulose, latex or a plastic material. 

15. The method of any one of claims 9-11 wherein the biological sample is
selected from the group consisting of whole blood, serum, plasma, saliva, cerebrospinal fluid 
and urine. 

16. The method of claim 15 wherein the biological sample is whole blood
or serum.

17. A method for detecting M. tuberculosis infection in a biological
sample, comprising: 

(a) contacting the sample with at least two oligonucleotide primers in a 
polymerase chain reaction, wherein at least one of the oligonucleotide primers is specific for a 
DNA molecule according to claim 5; and 

(b) detecting in the sample a DNA sequence that amplifies in the presence 
of the oligonucleotide primers, thereby detecting M. tuberculosis infection. 




WO 98/16645 PCT/US97/18214 

18. The method of claim 17, wherein at least one of the oligonucleotide 
primers comprises at least about 10 contiguous nucleotides of a DNA molecule according to 
claim 5. 

19. A method for detecting M. tuberculosis infection in a biological 
sample, comprising: 

(a) contacting the sample with at least two oligonucleotide primers in a 
polymerase chain reaction, wherein at least one of the oligonucleotide primers is specific for a 
DNA sequence selected from the group consisting of SEQ ID NOS: 3, 11, 12, 135, 136, 151-
155, 184-188, 194-195 and 198; and

(b) detecting in the sample a DNA sequence that amplifies in the presence 
of the first and second oligonucleotide primers, thereby detecting M. tuberculosis infection.

20. The method of claim 19, wherein at least one of the oligonucleotide
primers comprises at least about 10 contiguous nucleotides of a DNA sequence selected from 
the group consisting of SEQ ID NOS: 3, 11, 12, 135, 136, 151-155, 184-188, 194-195 and 
198. 

21. The method of claims 17 or 19 wherein the biological sample is 
selected from the group consisting of whole blood, sputum, serum, plasma, saliva, 
cerebrospinal fluid and urine. 

22. A method for detecting M. tuberculosis infection in a biological
sample, comprising:

(a) contacting the sample with one or more oligonucleotide probes specific
for a DNA molecule according to claim 5; and

(b) detecting in the sample a DNA sequence that hybridizes to the 
oligonucleotide probe, thereby detecting M. tuberculosis infection. 

23. The method of claim 22 wherein the probe comprises at least about 15
contiguous nucleotides of a DNA molecule according to claim 5. 

24. A method for detecting M. tuberculosis infection in a biological 
sample, comprising: 

(a) contacting the sample with one or more oligonucleotide probes specific 
for a DNA sequence selected from the group consisting of SEQ ID NOS: 3, 11, 12, 135, 136, 
151-155, 184-188, 194-195 and 198; and 

(b) detecting in the sample a DNA sequence that hybridizes to the 
oligonucleotide probe, thereby detecting M. tuberculosis infection. 

25. The method of claim 24 wherein the oligonucleotide probe comprises 
at least about 15 contiguous nucleotides of a DNA sequence selected from the group 
consisting of SEQ ID NOS: 3, 11, 12, 135, 136, 151-155, 184-188, 194-195 and 198. 

26. The method of claims 22 or 24 wherein the biological sample is 
selected from the group consisting of whole blood, sputum, serum, plasma, saliva, 
cerebrospinal fluid and urine. 

27. A method for detecting M. tuberculosis infection in a biological 
sample, comprising: 

(a) contacting the biological sample with a binding agent which is capable 
of binding to a polypeptide according to any one of claims 1-4; and 

(b) detecting in the sample a protein or polypeptide that binds to the 
binding agent, thereby detecting M. tuberculosis infection in the biological sample. 

28. A method for detecting M. tuberculosis infection in a biological
sample, comprising:

(a) contacting the biological sample with a binding agent which is capable
of binding to a polypeptide having an N-terminal sequence selected from the group consisting 
of sequences provided in SEQ ID NO: 129 and 130; and 

(b) detecting in the sample a protein or polypeptide that binds to the 
binding agent, thereby detecting M. tuberculosis infection in the biological sample.

29. A method for detecting M. tuberculosis infection in a biological
sample, comprising:

(a) contacting the biological sample with a binding agent which is capable 
of binding to a polypeptide encoded by a DNA sequence selected from the group consisting 
of SEQ ID NOS: 3, 11, 12, 135, 136, 151-155, 184-188, 194-195 and 198, the complements 
of said sequences, and DNA sequences that hybridize to a sequence recited in SEQ ID 
NOS: 3, 11, 12, 135, 136, 151-155, 184-188, 194-195 and 198; and 

(b) detecting in the sample a protein or polypeptide that binds to the 
binding agent, thereby detecting M. tuberculosis infection in the biological sample.

30. The method of any one of claims 27-29 wherein the binding agent is a 
monoclonal antibody. 

31. The method of any one of claims 27-29 wherein the binding agent is a
polyclonal antibody. 



32. A diagnostic kit comprising: 

(a) one or more polypeptides according to any of claims 1-4; and 

(b) a detection reagent. 



33. A diagnostic kit comprising:

(a) one or more polypeptides having an N-terminal sequence selected from 
the group consisting of sequences provided in SEQ ID NO: 129 and 130; and 

(b) a detection reagent. 

34. A diagnostic kit comprising:

(a) one or more polypeptides encoded by a DNA sequence selected from 
the group consisting of SEQ ID NOS: 3, 11, 12, 135, 136, 151-155, 184-188, 194-195 and 
198, the complements of said sequences, and DNA sequences that hybridize to a sequence 
recited in SEQ ID NOS: 3, 11, 12, 135, 136, 151-155, 184-188, 194-195 and 198; and 

(b) a detection reagent. 

35. The kit of any one of claims 32-34 wherein the polypeptide(s) are 
immobilized on a solid support. 

36. The kit of claim 35 wherein the solid support comprises nitrocellulose, 
latex or a plastic material. 

37. The kit of any one of claims 32-34 wherein the detection reagent 
comprises a reporter group conjugated to a binding agent. 

38. The kit of claim 37 wherein the binding agent is selected from the 
group consisting of anti-immunoglobulins, Protein G, Protein A and lectins. 

39. The kit of claim 37 wherein the reporter group is selected from the 
group consisting of radioisotopes, fluorescent groups, luminescent groups, enzymes, biotin 
and dye particles. 

40. A diagnostic kit comprising at least two oligonucleotide primers, at 
least one of the oligonucleotide primers being specific for a DNA molecule according to 
claim 5. 

41. A diagnostic kit according to claim 40, wherein at least one of the
oligonucleotide primers comprises at least about 10 contiguous nucleotides of a DNA
molecule according to claim 5. 

42. A diagnostic kit comprising at least two oligonucleotide primers, at
least one of the primers being specific for a DNA sequence selected from the group consisting 
of SEQ ID NOS: 3, 11, 12, 135, 136, 151-155, 184-188, 194-195 and 198. 

43. A diagnostic kit according to claim 42, wherein at least one of the 
oligonucleotide primers comprises at least about 10 contiguous nucleotides of a DNA
sequence selected from the group consisting of SEQ ID NOS: 3, 11, 12, 135, 136, 151-155, 
184-188, 194-195 and 198. 

44. A diagnostic kit comprising at least one oligonucleotide probe, the 
oligonucleotide probe being specific for a DNA molecule according to claim 5. 

45. A kit according to claim 44, wherein the oligonucleotide probe 
comprises at least about 15 contiguous nucleotides of a DNA molecule according to claim 5. 

46. A diagnostic kit comprising at least one oligonucleotide probe, the 
oligonucleotide probe being specific for a DNA sequence selected from the group consisting 
of SEQ ID NOS: 3, 11, 12, 135, 136, 151-155, 184-188, 194-195 and 198. 

47. A kit according to claim 46, wherein the oligonucleotide probe 
comprises at least about 15 contiguous nucleotides of a DNA sequence selected from the 
group consisting of SEQ ID NOS: 3, 11, 12, 135, 136, 151-155, 184-188, 194-195 and 198. 

48. A monoclonal antibody that binds to a polypeptide according to any of
claims 1-4.

49. A polyclonal antibody that binds to a polypeptide according to any of
claims 1-4.

50. A fusion protein comprising two or more polypeptides according to 
any one of claims 1-4. 

51. A fusion protein comprising one or more polypeptides according to 
any one of claims 1-4 and ESAT-6 (SEQ ID NO: 99). 

52. A fusion protein comprising a polypeptide having an N-terminal 
sequence selected from the group of sequences provided in SEQ ID NOS: 129 and 130. 

53. A fusion protein comprising one or more polypeptides according to 
any one of claims 1-4 and the M. tuberculosis antigen 38 kD (SEQ ID NO: 150).



54. A diagnostic kit comprising: 

(a) one or more fusion proteins according to any one of claims 50-53; and 

(b) a detection reagent. 